[R] Is this a documentation bug? Spss dates import

Luca Braglia lbraglia at gmail.com
Wed Mar 11 14:46:16 CET 2009


Hello R users,

bug seekers are needed!
To run these simple tests you need a copy of SPSS
and, obviously, R.
The problem is that converting dates imported from SPSS
gives wrong results if we follow the example in ?as.POSIXct:

    ## SPSS dates (R-help 2006-02-17)
     z <- c(10485849600, 10477641600, 10561104000, 10562745600)
     as.Date(as.POSIXct(z, origin="1582-10-14", tz="GMT"))


Note that ?as.POSIXct is consistent with the SPSS 'Programming and 
Data Management' guide (p. 116):

'Internally, dates and date/times are stored as the number of seconds 
from October 14, 1582, and times are stored as the number of seconds 
from midnight.'

I think the SPSS 'Programming and Data Management' guide is not very clear:
"times are stored as the number of seconds from midnight", but which midnight?
The midnight of October 13 or of October 14? I think the 14th, so
as.Date(as.POSIXct(z, origin="1582-10-14", tz="GMT"))
has to be changed to
as.Date(as.POSIXct(z, origin="1582-10-15", tz="GMT"))
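
A minimal sketch for comparing the two candidate origins without any time
zone entering the calculation: divide the stored seconds by 86400 and add the
resulting whole days to a Date origin. The names days, from14 and from15 are
only illustrative; they do not come from ?as.POSIXct or from SPSS.

```r
# SPSS date values from the ?as.POSIXct example (seconds)
z <- c(10485849600, 10477641600, 10561104000, 10562745600)

# Whole days elapsed since the SPSS origin; no time zone is involved here
days <- z / 86400

# Candidate origins: the documented one and the proposed one
from14 <- as.Date(days, origin = "1582-10-14")
from15 <- as.Date(days, origin = "1582-10-15")

data.frame(days, from14, from15)
```

The two columns differ by exactly one day, so comparing either of them with a
date known from the SPSS side shows which origin applies.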


Test:
-----
Let's create a vector of dates in SPSS and save it in C:\\date.sav

DATA LIST / mydate (date).
BEGIN DATA.
01/01/1960
11/07/1955
25/11/1962
08/06/1959
28-01-2003
15,03,03
1/1/1997
01-JAN-1998
END DATA.
save outfile = "C:\\date.sav" /compressed.



Now we use R:

library(foreign)
test.df <- read.spss("C:/date.sav", to.data.frame=TRUE)

Now, following ?as.POSIXct (but without the tz argument):

test.df$newdate <- as.Date(as.POSIXct(test.df$MYDATE , origin="1582-10-14"))

But if you take a look at the vector test.df$newdate you get:

R.date = SPSS.date - 1 day

Please confirm this! I would like to be sure that I didn't make any mistakes
(I used SPSS 11 + R 2.8).


If you go back to the data and change 14 to 15,

test.df$newdate2 <- as.Date(as.POSIXct(test.df$MYDATE, origin="1582-10-15"))

you get

R.date = SPSS.date


> test.df
       MYDATE    newdate   newdate2
1 11903760000 1959-12-31 1960-01-01
2 11762496000 1955-07-10 1955-07-11
3 11995257600 1962-11-24 1962-11-25
4 11885875200 1959-06-07 1959-06-08
5 13263091200 2003-01-27 2003-01-28
6 13267065600 2003-03-14 2003-03-15
7 13071456000 1996-12-31 1997-01-01
8 13102992000 1997-12-31 1998-01-01
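
The two conversions above leave out the tz="GMT" argument that the
?as.POSIXct example uses. A minimal sketch, to check whether that matters,
reruns the conversion with the 1582-10-14 origin built explicitly in GMT and
compares the result with the dates typed into SPSS. The names mydate, spss,
origin14 and newdate_gmt are illustrative only.

```r
# MYDATE values as read by read.spss() in the table above
mydate <- c(11903760000, 11762496000, 11995257600, 11885875200,
            13263091200, 13267065600, 13071456000, 13102992000)

# Dates entered in SPSS (day/month/year), written here as ISO dates
spss <- as.Date(c("1960-01-01", "1955-07-11", "1962-11-25", "1959-06-08",
                  "2003-01-28", "2003-03-15", "1997-01-01", "1998-01-01"))

# Build the documented origin explicitly in GMT, then add the seconds
origin14 <- as.POSIXct("1582-10-14", tz = "GMT")
newdate_gmt <- as.Date(origin14 + mydate, tz = "GMT")

# Side-by-side comparison; 'same' is TRUE where R and SPSS agree
data.frame(spss, newdate_gmt, same = newdate_gmt == spss)
```

If the same column is TRUE throughout, the one-day shift comes from the
time zone in which the origin is interpreted rather than from the origin date.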


Do you get the same?

If so, it may be necessary to ask Raynald Levesque (the author of the SPSS
'Programming and Data Management' guide) for clarification about SPSS,
and to submit a bug report for ?as.POSIXct.

Thank you,
  Luca
