[R] use of variable labels

janet rosenbaum jerosenb at hcs.harvard.edu
Wed Apr 9 00:01:11 CEST 2003


 
 
> In this particular case I don't see why you would want the numbers, but
> the function as.numeric() will extract the underlying numbers from a
> factor.
 
The mean was just an example.  We have a 4000 line program that expects
numbers.  I was hoping that there would be some way of dealing with this
problem on the level of the data.frame.
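
A data.frame-level conversion of the kind described above might look roughly like the sketch below. The helper name factors_to_codes and the toy data are made up for illustration and are not part of base R or the foreign package. Note that as.integer() on a factor returns the internal level codes (1, 2, ...), which need not match the numeric values originally stored in the Stata file.

```r
# Hypothetical helper (an assumption, not an existing API):
# replace every factor column of a data frame with its integer level codes.
factors_to_codes <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  if (any(is_fac)) {
    # as.integer() on a factor gives the level codes, not the original values
    df[is_fac] <- lapply(df[is_fac], as.integer)
  }
  df
}

# Toy data standing in for the result of read.dta()
d <- data.frame(
  sex = factor(c("male", "female", "female")),
  age = c(34, 51, 27)
)

d_num <- factors_to_codes(d)
str(d_num)  # both columns are now numeric; the labels are gone
```

This would be called once right after reading the data, so the rest of a long program sees only numbers.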

I'm guessing I'm just going to have to throw out the labels, since it's
not practical to cast to a number every time.  I also just noticed
something strange about having convert.factors=TRUE:

When I do 
read.dta("filename.dta")
some of the variables which are numbers are read as NA:
      age            educyrs
 refuse:   0    refuse:   0
 DK    :   0    DK    :   0
 NA's  :1068    NA's  :1068

When I do
read.dta("filename.dta", convert.factors=FALSE)
the variables are again treated like numbers:

      age           educyrs    
 Min.   :18.00   Min.   : 0.00
 1st Qu.:30.00   1st Qu.: 5.00
 Median :41.00   Median : 9.00
 Mean   :43.18   Mean   : 8.65
 3rd Qu.:54.00   3rd Qu.:12.00
 Max.   :88.00   Max.   :40.00
 NA's   :18.00   NA's   :87.00  

I'm guessing that this means that by default -only- the labels are used 
when convert.factors=TRUE, and even variables without labels have to be
cast as numbers.
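
One way this pattern of NAs could arise, offered as an assumption rather than a confirmed description of what read.dta does internally: if a factor is built with only the labelled values as its levels, every value that has no label becomes NA. The codes 98 and 99 and the toy ages below are invented for illustration.

```r
# Toy values; 98 and 99 are assumed codes for the two labels
age <- c(18, 41, 88, NA)

# Only the labelled codes are levels, so every unlabelled value turns into NA
age_f <- factor(age, levels = c(98, 99), labels = c("refuse", "DK"))

summary(age_f)       # zero counts for both labels, everything else NA
all(is.na(age_f))    # TRUE
```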

Anyhow, thanks so much for the help.  
Thanks,

Janet


