[R] "Types" of missingness
ggrothendieck at gmail.com
Sun Feb 28 13:23:05 CET 2010
On Sun, Feb 28, 2010 at 2:39 AM, Christian Raschke
<crasch2 at tigers.lsu.edu> wrote:
> Dear R-List,
> My question concerns missing values. Specifically, is it possible to
> use different "types" of missingness in a dataset and not a
> one-size-fits-all NA?
> For example, data may be missing because of an outright refusal by a
> respondent to answer a question, or because she didn't know an answer,
> or because the item simply did not apply. In later analysis it is
> sometimes useful to be able to distinguish between the cases, but
> nonetheless have them all treated as missing when using, say, lm( ).
> In Stata this is possible by using different missing value indicators.
> The standard one is a period '.' whereas '.a' and '.b' etc are treated
> as missing too, but can all be distinguished from one another (they are even
> ordinal such that . < .a < .b).
> To give a simplistic example in R, let
> > dat <- data.frame(
> + hours = c(36, 40, 40, 0, 37.5, 0, 36, 20, 40),
> + wage = c( 15.5, 7.5, 8, -1, 17.5, -1, -2, 13, -2))
> > dat
>   hours wage
> 1  36.0 15.5
> 2  40.0  7.5
> 3  40.0  8.0
> 4   0.0 -1.0
> 5  37.5 17.5
> 6   0.0 -1.0
> 7  36.0 -2.0
> 8  20.0 13.0
> 9  40.0 -2.0
> where for wages -1 indicates "didn't work" and -2 indicates "refused to
> respond". How could I replace the negative values for wages with
> missingness indicators so that the data frame can be used in, for instance,
> lm( ), but later operate only on those observations that "refused to respond"?
> Of course I can always work around this somehow, especially in this easy
> example, but as data frames get larger and cases more complex the
> workarounds seem more and more klutzy to me.
> So, if there is an easy way to do this that I have overlooked, I would
> be grateful for any advice or references.
> Christian Raschke
> Department of Economics
> ISDS Research Lab (HSRG)
> Louisiana State University
> Patrick Taylor Hall, Rm 2128
> Baton Rouge, LA 70803
> crasch2 at lsu.edu
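
One possible way to do this, sketched here only as an illustration, is to
keep the reason for missingness in a companion column and set the coded
values themselves to NA. The names miss_codes and wage_miss below are
hypothetical and not part of the original data; the codes -1 and -2 are the
ones given in the question. This assumes the default
options(na.action = "na.omit"), under which lm( ) drops rows with NA.

```r
# Example data from the question: in wage, -1 = "didn't work",
# -2 = "refused to respond"
dat <- data.frame(
  hours = c(36, 40, 40, 0, 37.5, 0, 36, 20, 40),
  wage  = c(15.5, 7.5, 8, -1, 17.5, -1, -2, 13, -2))

# Hypothetical lookup table mapping each reason to its special code
miss_codes <- c("didn't work" = -1, "refused" = -2)

# Record the reason in a companion factor column (NA where the wage
# is a real value), then replace the coded wages with NA
dat$wage_miss <- factor(names(miss_codes)[match(dat$wage, miss_codes)],
                        levels = names(miss_codes))
dat$wage[!is.na(dat$wage_miss)] <- NA

# With the default na.action, lm() omits the rows where wage is NA
fit <- lm(wage ~ hours, data = dat)

# Later, operate only on the respondents who refused to respond
refused <- dat[which(dat$wage_miss == "refused"), ]
```

The companion column has to be carried along whenever the data frame is
subset or merged, which is the main cost of this approach compared with
Stata's built-in extended missing values.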