[R] use of variable labels
janet rosenbaum
jerosenb at hcs.harvard.edu
Wed Apr 9 00:01:11 CEST 2003
> In this particular case I don't see why you would want the numbers, but
> the function as.numeric() will extract the underlying numbers from a
> factor.
The mean was just an example. We have a 4000-line program that expects
numbers. I was hoping that there would be some way of dealing with this
problem at the level of the data.frame.
I'm guessing I'll just have to throw out the labels, since it's not
practical to cast to a number every time. I also just noticed
something strange about having convert.factors=TRUE:
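For what it's worth, here is a minimal sketch of doing the conversion once for the whole data.frame rather than at every use. The data frame and its column names are hypothetical stand-ins for whatever read.dta() returns. Note that as.numeric() on a factor returns the internal level codes (1, 2, ...), which are not necessarily the values stored in the original file.

```r
# Hypothetical data frame standing in for the result of read.dta()
df <- data.frame(
  age   = c(18, 30, 41),
  sex   = factor(c("male", "female", "male")),
  score = factor(c("10", "20", "10"))
)

# Find the factor columns and replace each with its integer level codes
is_fac <- vapply(df, is.factor, logical(1))
df_codes <- df
df_codes[is_fac] <- lapply(df[is_fac], as.numeric)

# When the levels are themselves numerals, go through as.character()
# to recover the printed values rather than the level codes
score_values <- as.numeric(as.character(df$score))
```

After this, every column of df_codes is numeric, so code that expects numbers can use it directly, at the cost of losing the labels.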
When I do
read.dta("filename.dta")
some of the variables which are numbers are read as NA:
     age           educyrs
 refuse:   0   refuse:   0
 DK    :   0   DK    :   0
 NA's  :1068   NA's  :1068
When I do
read.dta("filename.dta", convert.factors=FALSE)
the variables are again treated like numbers:
      age            educyrs
 Min.   :18.00   Min.   : 0.00
 1st Qu.:30.00   1st Qu.: 5.00
 Median :41.00   Median : 9.00
 Mean   :43.18   Mean   : 8.65
 3rd Qu.:54.00   3rd Qu.:12.00
 Max.   :88.00   Max.   :40.00
 NA's   :18.00   NA's   :87.00
I'm guessing that this means that by default -only- the labels are used
when convert.factors=TRUE, and even variables without labels have to be
cast as numbers.
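One possible reading of the output above, offered as an assumption rather than something checked against the foreign package's code: if a variable carries value labels for only a few special codes, and the factor is built with just those labelled codes as its levels, then every ordinary value falls outside the levels and becomes NA. The sketch below reproduces that pattern in plain R. The raw values and the codes -1 and -2 for "refuse" and "DK" are hypothetical.

```r
# Hypothetical raw values for a variable such as age
age_raw <- c(18, 30, 41, 54, 88)

# Assumed label table: only the special codes carry labels
# (the codes -1 and -2 are made up for illustration)
age_fac <- factor(age_raw, levels = c(-1, -2), labels = c("refuse", "DK"))

# Values that match no level become NA, so both label counts are zero
summary(age_fac)

# The raw numbers remain usable when no factor is built
summary(age_raw)
```

If that is what is happening, reading with convert.factors=FALSE and keeping the labels separately would avoid the loss.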
Anyhow, thanks so much for the help.
Thanks,
Janet