[R] read in Stata and SPSS with value labels/formats

Frank Harrell f.harrell at vanderbilt.edu
Thu Jan 19 23:45:46 CET 2012


require(Hmisc)
?spss.get
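
To make that pointer a little more concrete, here is a minimal sketch, assuming Hmisc (and the foreign package it calls) is installed. spss.get is a wrapper around foreign's read.spss that attaches variable labels to the individual columns. The helper name read_sav_codes is hypothetical, not part of Hmisc, and the file path is the one from the quoted post. Reading with use.value.labels = FALSE keeps the numeric codes rather than converting them to factors, which may help in checking a variable such as occupation.

```r
# Hypothetical helper (not part of Hmisc): read an SPSS .sav file with
# Hmisc::spss.get while keeping the numeric codes.
read_sav_codes <- function(path) {
  if (!requireNamespace("Hmisc", quietly = TRUE)) {
    stop("package 'Hmisc' is required")
  }
  if (!file.exists(path)) {
    stop("file not found: ", path)
  }
  # use.value.labels = FALSE: do not turn labelled variables into factors
  Hmisc::spss.get(path, use.value.labels = FALSE)
}

# Usage, with the path from the original post (adjust as needed):
# myspss <- read_sav_codes("data/hlthintl.sav")
# table(myspss$occupation)          # frequencies of the raw codes
# Hmisc::label(myspss$occupation)   # variable label attached by spss.get
```

For the Stata file, Hmisc also has stata.get, which plays the same role for read.dta.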

Xu Jun wrote
> 
> Sorry I forgot the subject line last time
> 
> Dear R experts,
> 
> I am using the foreign package to read in Stata- and SPSS-format data
> files (the same data, saved in both formats). I first tried
> read.dta for the Stata format:
> 
> 
> ##########################
>> library(foreign)
>> mystata <- read.dta("data/hlthintl.dta", convert.factor=FALSE)
> Error in read.dta("data/hlthintl.dta", convert.factor = FALSE) :
>  a binary read error occurred
> ##########################
> 
> Then I tried saving this Stata file in an older format, without labels,
> from within Stata:
> ************************************
> use "data\hlthintl.dta", clear
> saveold "data\hlthintlold.dta", nolabel
> ************************************
> 
> Then I read the hlthintlold.dta into R without problems, but of course
> without value labels. Well, to keep these value labels, I turned to
> SPSS. Here is what I did and got:
> 
> #########################
>> myspss <- read.spss("data/hlthintl.sav", use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
> There were 50 or more warnings (use warnings() to see the first 50)
>> warnings()
> Warning messages:
> 1: In read.spss("data/hlthintl.sav",  ... :
>  data/hlthintl.sav: File contains duplicate label for value 276.2 for
> variable V4
> 2: In read.spss("data/hlthintl.sav",  ... :
>  data/hlthintl.sav: File contains duplicate label for value 376.2 for
> variable V4
> 3: In read.spss("data/hlthintl.sav",  ... :
>  data/hlthintl.sav: File contains duplicate label for value 826.2 for
> variable V4
> 4: In xi >= z[1L] | xi <= z[2L] | xi[xi == z[3L]] :
>  longer object length is not a multiple of shorter object length
> 5....
> 6....
> ...
> ...
> 50.....
> ########################
> 
> 
> Warnings 5-50 are the same as warning 4. Most of the data now transfer
> into R correctly, but when I check an occupation variable, it has lost
> all of its numeric coding (the frequencies are all zero):
> 
> 
> ########################
>> table(myspss$occupation)
> 
>                                                ARMED FORCES
>                                                           0
>                                                    Soldiers
>                                                           0
>                                                    Officers
>                                                           0
> ...
> ...
> ...
> ...
> 
>              Hand packers and other manufacturing labourers
>                                                           0
>                    TRANSPORT LABOURERS AND FREIGHT HANDLERS
>                                                           0
>                               Hand or pedal vehicle drivers
>                                                           0
>              Drivers of animal-drawn vehicles and machinery
>                                                           0
>                                            Freight handlers
>                                                           0
>                                                     Refused
>                                                           0
>                                                   Dont know
>                                                           0
> Warning message:
> In `levels<-`(`*tmp*`, value = c("ARMED FORCES", "Soldiers", "Officers",  :
>  duplicated levels will not be allowed in factors anymore
> ########################################
> 
> Any thoughts or suggestions? Thanks a lot!
> 
> Jun Xu, PhD
> Assistant Professor
> Department of Sociology
> Ball State University
> 
> 
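
The last warning quoted above says that the factor levels for occupation contain duplicates, i.e. the same label text appears more than once. One possible way to inspect this, sketched in base R under stated assumptions: the vector occ below is made-up stand-in data (the codes 100, 110 and 9999 are invented for illustration), and it assumes the variable was read with use.value.labels = FALSE so that it carries a "value.labels" attribute whose names are the label texts and whose values are the codes.

```r
# Made-up stand-in for a variable read with use.value.labels = FALSE:
# numeric codes plus a "value.labels" attribute (names = label text).
occ <- c(100, 110, 110, 9999)
attr(occ, "value.labels") <- c("ARMED FORCES" = 100,
                               "ARMED FORCES" = 110,
                               "Dont know"    = 9999)

vl <- attr(occ, "value.labels")

# Label texts that are attached to more than one code
dup_labels <- unique(names(vl)[duplicated(names(vl))])

# Make the labels unique by appending the code to each repeated label,
# then build the factor by hand from the raw codes.
lab <- names(vl)
rep_idx <- lab %in% dup_labels
lab[rep_idx] <- paste0(lab[rep_idx], " [", vl[rep_idx], "]")
occ_f <- factor(as.vector(occ), levels = unname(vl), labels = lab)

table(occ_f)
```

With real data the same steps would be applied to attr(myspss$occupation, "value.labels"), provided that attribute is present after the import.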


-----
Frank Harrell
Department of Biostatistics, Vanderbilt University