[R] How can I import user-defined missings from Spss?

Christine Christmann christinechristmann at web.de
Tue Apr 15 11:51:03 CEST 2008


Hi,

I can import SPSS datasets with read.spss from library(foreign) or with spss.get from library(Hmisc).
But whichever way I import the data, the user-defined missing values from SPSS are always lost.
(It makes no difference whether they are a single value, a range, or any combination of these; they are always ignored.)
Is there any way in R to find out whether a value was defined as user-missing in SPSS?
Keeping the information as an attribute would suit me fine; keeping such values as a character string like "miss" would be even better.
Transforming them into NA, as happens automatically with the sysmis values from SPSS, would be another alternative.

Unfortunately I don't know if any of these options are possible. Could you help me out?

Let me give you an example:
Preconditions: You need to have SPSS on your computer to generate the SPSS data file.
You need to create the folder C:/tmp to save the SPSS file. As you can see, I work on Windows.

*/1) Generate the SpssData:
*/data.
DATA LIST LIST /age (f2) sport (f2).
BEGIN DATA
22, 1 
40, 2
69, 1
19, 2
-99, 9
END DATA.


*/description.
missing values age (LO thru 0).
missing values sport (9).
var label age "age".
var label sport "Do you like sports".
value label sport 
1 "yes"
2 "no"
3 "don't know".

*frequencies in Spss.
freq age sport.


save outfile = "C:\tmp\test.sav".
*-----------------------------------------------------------------------------------------.


2) Import the SPSS data into R via Hmisc or foreign; both work fine.

#import Spssdata in R
spssfile <- "C:/tmp/test.sav"

#via Hmisc
library(Hmisc)
Signs <- c("_")
mydata1 <- spss.get(spssfile,lowernames=TRUE, allow=Signs)

#via foreign
library(foreign)
mydata2 <- read.spss(spssfile,use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)

#frequencies in R (describe() comes from Hmisc)
describe(mydata1)
describe(mydata2)


*-----------------------------------------------------------------------------------------.
Have a look at the two variables age and sport. In SPSS the value -99 in age is a user-defined missing, as is the value 9 in sport.
As you can see, the information about the missing values is lost in R. What can I do?
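
A minimal sketch of the result being asked for, done by hand in base R. It does not read the .sav file: the data frame is typed in from the DATA LIST block above, and the missing-value rules are re-entered manually from the SPSS syntax. The names user_missing and the attribute "user_missing" are made up for this illustration and are not part of foreign or Hmisc.

```r
# Hand-built copy of the example data as it arrives in R
# (values assumed from the DATA LIST block; no .sav file is read here).
mydata <- data.frame(age   = c(22, 40, 69, 19, -99),
                     sport = c(1, 2, 1, 2, 9))

# User-defined missing rules, re-entered by hand from the SPSS syntax.
# 'user_missing' is a hypothetical helper list, not a foreign/Hmisc feature.
user_missing <- list(
  age   = function(x) !is.na(x) & x <= 0,  # missing values age (LO thru 0).
  sport = function(x) !is.na(x) & x == 9   # missing values sport (9).
)

for (v in names(user_missing)) {
  x    <- mydata[[v]]
  flag <- user_missing[[v]](x)      # TRUE where the value was user-missing
  x[flag] <- NA                     # treat it like sysmis
  attr(x, "user_missing") <- flag   # keep the information as an attribute
  mydata[[v]] <- x
}

# Inspect which values were user-defined missings
attr(mydata$age, "user_missing")
attr(mydata$sport, "user_missing")
```

This only shows the outcome wanted (NA plus a record of which values were user-missing). The open question is how to get the same information from the SPSS file itself at import time, without retyping the rules.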


Many thanks,
Christine Christmann


