[R] is.na<- doesn't seem to work with labelled variables?

Marc Schwartz marc_schwartz at me.com
Tue Apr 6 13:42:22 CEST 2010


On Apr 6, 2010, at 6:31 AM, David Foreman wrote:

> Dear All,
> 
> I seem entirely unable to solve what should be a very simple problem.  I
> have imported a SPSS dataset into R using spss.get from Frank Harrell's
> Hmisc library.  The variables are imported classed as 'labelled': missing
> values are coded as either the SPSS missing value code, 8 or 88.  All are
> imported correctly; 8 and 88 being identified as true numbers in the
> 'summary' command, which treats them (correctly) as numeric. For some
> reason, is.na(x)<-c(8,88) doesn't seem to work. No error message is
> returned, but the 8 and 88 are not set as NA.  The example given in the help
> file does work, so is.na<- is functioning in my copy of R (2.10.1).  I've
> tried working outside the dataframe, using unclass, as.numeric, and class<-
> entirely without success.  The variable structure(exemplified by one
> variable) is
> 
> Class 'labelled'  atomic [1:827] 2 2 2 2 2 2 8 NA 2 2 ...
>  ..- attr(*, "label")= Named chr "In the last yr, you were unfairly treated
> because of your sex"
>  .. ..- attr(*, "names")= chr "c25a"
> 
> All suggestions are gratefully received!


Note the description of the 'value' argument when using is.na(x) for assignment:

  value	    a suitable index vector for use with x.


x <- c(2, 2, 2, 2, 2, 2, 8, NA, 2, 2, 88) 

is.na(x) <- x %in% c(8, 88)

> x
 [1]  2  2  2  2  2  2 NA NA  2  2 NA


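The RHS needs to be an index vector (here a logical one) identifying the positions in x to change to NA, not the values to be replaced. With is.na(x) <- c(8, 88), the 8 and 88 are taken as element positions, so elements 8 and 88 of x are set to NA and no error is raised: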

> x %in% c(8, 88)
 [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE
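
To make the difference between positions and values concrete, here is a minimal sketch using the same toy vector as above. The names y and z are just illustrative copies, and the behaviour shown is that of the default is.na<- method on a plain numeric vector:

```r
x <- c(2, 2, 2, 2, 2, 2, 8, NA, 2, 2, 88)

# Positional form: 8 and 88 are read as element positions, not values.
y <- x
is.na(y) <- c(8, 88)
length(y)   # the 11-element vector has been padded with NA out to position 88
y[7]        # the value 8 at position 7 is left untouched

# Value-based form using integer positions instead of a logical vector.
z <- x
is.na(z) <- which(z %in% c(8, 88))
z
```

Either a logical vector or the integer positions from which() works as the index; the point is that the index has to be derived from the values first.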


Also see ?"%in%"
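
If several variables in the imported data frame share the same missing value codes, the same idea can be applied column by column. This is only a sketch under assumptions: the data frame d, its columns, and missing_codes are made up for illustration, and it assumes that 8 and 88 are never valid responses in any of the columns being recoded:

```r
# Hypothetical data standing in for the imported SPSS file.
d <- data.frame(c25a = c(2, 2, 8, NA, 88),
                c25b = c(1, 88, 2, 8, 2))

# Assumed missing value codes, taken from the original question.
missing_codes <- c(8, 88)

# Recode each column; d[] <- keeps d as a data frame.
d[] <- lapply(d, function(col) {
  is.na(col) <- col %in% missing_codes
  col
})

d
```

Since %in% returns FALSE rather than NA for elements that are already NA, existing missing values do not cause problems in the index.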

HTH,

Marc Schwartz


