[R] read in Stata and SPSS with value labels/formats
Xu Jun
junxu.r at gmail.com
Thu Jan 19 20:19:55 CET 2012
Sorry, I forgot the subject line last time.
Dear R experts,
I am using the foreign package to read Stata and SPSS data files (the
same data, saved in two different formats). I first tried read.dta on
the Stata file:
##########################
> library(foreign)
> mystata <- read.dta("data/hlthintl.dta", convert.factor=FALSE)
Error in read.dta("data/hlthintl.dta", convert.factor = FALSE) :
a binary read error occurred
##########################
Then, in Stata, I saved the file in an older format without value labels:
************************************
use "data\hlthintl.dta", clear
saveold "data\hlthintlold.dta", nolabel
************************************
I could then read hlthintlold.dta into R without problems, but of
course without value labels. To keep the value labels, I turned to
SPSS. Here is what I did and what I got:
#########################
> myspss <- read.spss("data/hlthintl.sav", use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In read.spss("data/hlthintl.sav", ... :
data/hlthintl.sav: File contains duplicate label for value 276.2 for
variable V4
2: In read.spss("data/hlthintl.sav", ... :
data/hlthintl.sav: File contains duplicate label for value 376.2 for
variable V4
3: In read.spss("data/hlthintl.sav", ... :
data/hlthintl.sav: File contains duplicate label for value 826.2 for
variable V4
4: In xi >= z[1L] | xi <= z[2L] | xi[xi == z[3L]] :
longer object length is not a multiple of shorter object length
5....
6....
...
...
50.....
########################
Warnings 5-50 are the same as warning 4. Most of the data now come
into R correctly, but when I check the occupation variable, it has
lost all of its numeric codes: every level shows a frequency of zero.
########################
> table(myspss$occupation)
ARMED FORCES
0
Soldiers
0
Officers
0
...
...
...
...
Hand packers and other manufacturing labourers
0
TRANSPORT LABOURERS AND FREIGHT HANDLERS
0
Hand or pedal vehicle drivers
0
Drivers of animal-drawn vehicles and machinery
0
Freight handlers
0
Refused
0
Dont know
0
Warning message:
In `levels<-`(`*tmp*`, value = c("ARMED FORCES", "Soldiers", "Officers", :
duplicated levels will not be allowed in factors anymore
########################################
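One possible way to work around the duplicated-levels warning is to read the variable as plain numeric codes and build the factor by hand with labels made unique. This is only a minimal sketch, not a verified fix for this file. The codes and labels below are invented stand-ins. With the real data they would presumably come from read.spss(..., use.value.labels=FALSE), which, as far as I understand the foreign package, keeps the labels in a "value.labels" attribute on each variable; that attribute may need as.numeric() before being used as levels.

```r
# Hypothetical stand-in for an occupation column read as numeric codes.
codes <- c(110, 5112, 110, 9333, 5112)

# Hypothetical label table (label = code); one label is used for two codes.
value_labels <- c("Officers" = 110, "Soldiers" = 5112,
                  "Freight handlers" = 9333, "Freight handlers" = 9334)
label_names <- names(value_labels)

# Labels that occur more than once cannot serve as distinct factor levels.
dup_labels <- unique(label_names[duplicated(label_names)])

# Append the numeric code to every duplicated label so all labels are unique.
unique_labels <- ifelse(label_names %in% dup_labels,
                        paste0(label_names, " [", value_labels, "]"),
                        label_names)

# Build the factor from the codes; levels and labels are matched by position.
occupation <- factor(codes, levels = unname(value_labels),
                     labels = unique_labels)
print(table(occupation))
```

If the frequencies are still all zero after this, the codes in the data probably do not match the codes in the label table, which would be worth checking separately.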
Any thoughts or suggestions? Thanks a lot!
Jun Xu, PhD
Assistant Professor
Department of Sociology
Ball State University