[R] How can I import user-defined missings from Spss?

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Apr 15 12:13:26 CEST 2008


You have already had a reply to a version of this (posted from another 
address) at https://stat.ethz.ch/pipermail/r-help/2008-April/159342.html . 
'Kind souls' are likely to get exasperated when their help is 
unacknowledged.

You need SPSS and Windows to reproduce this, and this is the R forum.  To 
fulfil the footer of the message you need to make the SPSS save file 
available.

On Tue, 15 Apr 2008, Christine Christmann wrote:

> Hi,
>
> It works for me to import SPSS datasets via library(foreign) with read.spss or via library(Hmisc) with spss.get.
> But no matter which way I import the data, user-defined missings from SPSS are always lost.
> (It makes no difference whether there is a single value, a range, or any combination of them; they are always ignored.)
> Is there any way in R to find out whether a value was user-defined missing in SPSS?
> Even keeping the information as an attribute would suit me fine; keeping the values as a character string like "miss" would be even better.
> Transforming them into NA, as happens automatically with the sysmis data from SPSS, would be another alternative.
>
> Unfortunately I don't know if any of these options are possible. Could you help me out?
>
> Let me give you an example:
> Preconditions: You need to have SPSS on your computer to generate the SPSS data.
> You need to create the folder C:/tmp to save the SPSS file. As you can see, I work with Windows.
>
> */1) Generate the SpssData:
> */data.
> DATA LIST LIST /age (f2) sport (f2).
> BEGIN DATA
> 22, 1
> 40, 2
> 69, 1
> 19, 2
> -99, 9
> END DATA.
>
>
> */description.
> missing values age (LO thru 0).
> missing values sport (9).
> var label age "age".
var label sport "Do you like sports".
> value label sport
> 1 "yes"
> 2 "no"
> 3 "don't know".
>
> *frequencies in Spss.
> freq age sport.
>
>
> save outfile = "C:\tmp\test.sav".
> *-----------------------------------------------------------------------------------------.
>
>
> 2) Import the Spss Data in R. Via Hmisc or foreign - both work fine.
>
> #import Spssdata in R
> spssfile <- "C:/tmp/test.sav"
>
> #via Hmisc
> library(Hmisc)
> Signs <- c("_")
> mydata1 <- spss.get(spssfile,lowernames=TRUE, allow=Signs)
>
> #via foreign
> library(foreign)
> mydata2 <- read.spss(spssfile,use.value.labels=TRUE, max.value.labels=Inf, to.data.frame=TRUE)
>
> #freq in r
> describe(mydata1)
> describe(mydata2)
>
>
> *-----------------------------------------------------------------------------------------.
> Have a look at the two variables age and sport. In SPSS the value -99 in age is a user-defined missing, as is the value 9 in sport.
> As you can see, the information about the missings is lost in R. What can I do?
>
>
> Many Thanks Christine Christmann
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
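
For readers without SPSS, the recoding asked for above can be sketched in 
plain R after import.  This is only an illustration under stated 
assumptions: the data frame is typed by hand from the five cases in the 
example, the helper recode_user_missing() and the attribute name 
"user_missing" are hypothetical, and the missing-value rules (age LO thru 
0, sport 9) are copied from the SPSS syntax rather than read from the save 
file.  Later versions of foreign also document a use.missings argument to 
read.spss; check ?read.spss to see whether your installed version has it.

```r
# Hand-typed copy of the five cases from the SPSS example, as they look
# in R when the user-defined missing codes have not been applied.
mydata <- data.frame(age   = c(22, 40, 69, 19, -99),
                     sport = c(1, 2, 1, 2, 9))

# Hypothetical helper (not part of foreign or Hmisc).
# 'is_missing' is a function returning TRUE for codes that were declared
# user-missing in SPSS.  Flagged values become NA; the original codes are
# kept in an attribute so the information is not lost.
recode_user_missing <- function(x, is_missing) {
  flagged  <- !is.na(x) & is_missing(x)
  original <- x[flagged]
  x[flagged] <- NA
  attr(x, "user_missing") <- original
  x
}

# Rules copied by hand from the SPSS syntax:
#   missing values age (LO thru 0).
#   missing values sport (9).
mydata$age   <- recode_user_missing(mydata$age,   function(v) v <= 0)
mydata$sport <- recode_user_missing(mydata$sport, function(v) v == 9)

# Number of NAs per variable, and the codes that were replaced
colSums(is.na(mydata))
attr(mydata$age, "user_missing")
attr(mydata$sport, "user_missing")
```

Keeping the replaced codes in an attribute covers the "keep the 
information as an attribute" option; converting to NA first means summary 
functions treat these cases like any other missing value.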

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


