[R] Problem with read.spss() and as.data.frame(), or: alternative to subset()?

Dirk Enzmann dirk.enzmann at jura.uni-hamburg.de
Wed Sep 21 13:18:32 CEST 2005


The selection problem can be solved by

dr2000=read.spss('myfile')
d=lapply(dr2000,subset,dr2000$RBINZ99 > 0)

however, there is still the problem that R crashes when using

d = as.data.frame(dr2000)

or

dr2000=read.spss('myfile',to.data.frame=T)

Any suggestions why? I checked whether all components of dr2000 have 
the same length and which class each component has. This is not the 
problem: each component has the same length (9232), and there are 66 
components of class 'character', 981 of class 'factor', and 479 of 
class 'numeric'.
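
For reference, the checks and the row selection described above can be 
sketched as follows. This is only a minimal illustration on a small 
made-up list standing in for the result of read.spss(); the component 
names SEX and ID and all values are invented, and it does not reproduce 
or explain the crash.

```r
# Toy stand-in for the list returned by read.spss(); values are invented.
dr2000 <- list(
  RBINZ99 = c(0, 2, 5, 0, 1),
  SEX     = factor(c("m", "f", "f", "m", "f")),
  ID      = c("a", "b", "c", "d", "e")
)

# All components should report one common length.
unique(sapply(dr2000, length))

# Tabulate the (first) class of each component.
table(sapply(dr2000, function(x) class(x)[1]))

# Select cases: apply the same logical index to every component,
# treating missing values of RBINZ99 as "not selected".
keep <- !is.na(dr2000$RBINZ99) & dr2000$RBINZ99 > 0
d <- lapply(dr2000, function(x) x[keep])
```

The result d is still a list, as with the lapply()/subset() call, but 
every component now holds only the selected cases.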


> Trying to select a subset of cases (rows of data) I encountered several 
> problems:
> 
> Firstly, because I did not read the help to read.spss() thoroughly 
> enough, I treated the data read as a data frame. For example,
> 
> dr2000 <- read.spss('myfile.sav')
> d <- subset(dr2000,RBINZ99 > 0)
> 
> and thus received an error message (Object "RBINZ99" not found), because 
> dr2000 is not a data.frame but a list (shown by class(dr2000)).
> 
> d <- subset(dr2000,dr2000$RBINZ99 > 0)
> 
> didn't help either, because now d is empty (dim = NULL).
> 
> Thus, I tried to use the option "to.data.frame=T" of read.spss():
> 
> dr2000 <- read.spss('myfile.sav',to.data.frame=T)
> 
> However, now R "crashes" ('R for Windows GUI front-end has found an 
> error and must be closed') (the error message is in German).
> 
> Finally, I tried again using read.spss() without the option 
> 'to.data.frame=T' (as before) and tried to convert dr2000 to a data 
> frame by using
> 
> d <- as.data.frame(dr2000)
> 
> However, R crashes again (with the same error message).
> 
> Of course, I could use SPSS first and save only the cases with RBINZ99 > 
> 0, but this is not always possible (all users of the data must have SPSS 
> available and we have to use different selection criteria). Is there 
> another possibility to solve the problem by using R? I want to select 
> certain rows (cases) based on the values of one "variable" of dr2000, 
> but keep all columns (variables) - although dr2000 is not a data frame?
> 
> And: R should not crash but rather give a warning.
> 
> ------------------------
> R version 2.1.1 Patched (2005-07-15)
> Package Foreign Version 0.8-10
> 
> Operating system: Windows XP Professional (5.1 (Build 2600))
> CPU: Pentium Model 2 Stepping 9
> RAM: 512 MB

*************************************************
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Edmund-Siemers-Allee 1
D-20146 Hamburg
Germany

phone: +49-040-42838.7498 (office)
        +49-040-42838.4591 (Billon)
fax:   +49-040-42838.2344
email: dirk.enzmann at jura.uni-hamburg.de
www: 
http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html



