[R] Problem with ddply in the plyr-package: surprising output of a date-column
Brian Diggs
diggsb at ohsu.edu
Mon Apr 25 23:14:19 CEST 2011
On 4/25/2011 1:07 PM, Hadley Wickham wrote:
>> If you need plyr for other tasks you ought to use a different
>> class for your date data (or wait until plyr can deal with
>> POSIXlt objects).
>
> How do you get POSIXlt objects into a data frame?
>
>> df<- data.frame(x = as.POSIXlt(as.Date(c("2008-01-01"))))
>> str(df)
> 'data.frame': 1 obs. of 1 variable:
> $ x: POSIXct, format: "2008-01-01"
>
>> df<- data.frame(x = I(as.POSIXlt(as.Date(c("2008-01-01")))))
>> str(df)
> 'data.frame': 1 obs. of 1 variable:
> $ x: AsIs, format: "0"
>
> Hadley
By assigning to a column after the data.frame creation step:
> df <- data.frame(x = as.POSIXlt(as.Date(c("2008-01-01"))))
> str(df)
'data.frame': 1 obs. of 1 variable:
$ x: POSIXct, format: "2008-01-01"
> dput(df)
structure(list(x = structure(1199145600, class = c("POSIXct",
"POSIXt"), tzone = "UTC")), .Names = "x", row.names = c(NA, -1L
), class = "data.frame")
> df$x <- as.POSIXlt(as.Date(c("2008-01-01")))
> str(df)
'data.frame': 1 obs. of 1 variable:
$ x: POSIXlt, format: "2008-01-01"
> dput(df)
structure(list(x = structure(list(sec = 0, min = 0L, hour = 0L,
mday = 1L, mon = 0L, year = 108L, wday = 2L, yday = 0L, isdst = 0L),
.Names = c("sec", "min", "hour", "mday", "mon", "year", "wday",
"yday", "isdst"), class = c("POSIXlt", "POSIXt"), tzone = "UTC")),
.Names = "x", row.names = c(NA, -1L), class = "data.frame")
This is reminiscent of the 1d array problem; there are types that are
coerced into other types when passed as part of a data.frame constructor
(data.frame call), but are not coerced when assigned to a column.
Looking at help pages, calls to data.frame call as.data.frame on each
argument; `[<-.data.frame` has a section on coercion which starts "The
story over when replacement values are coerced is a complicated one, and
one that has changed during R's development. This section is a guide
only." which makes me think it is not all that well defined.
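A minimal sketch of the contrast described above, using only base R. The data frame names df_ctor and df_assign and the id column are made up for illustration; the behaviour is what I would expect from the transcript above, not a guarantee across R versions.

```r
# The same POSIXlt value, entered into a data frame in two different ways.
lt <- as.POSIXlt(as.Date("2008-01-01"))

# 1. Through the constructor: data.frame() calls as.data.frame() on each
#    argument, and the POSIXlt method converts the value to POSIXct.
df_ctor <- data.frame(x = lt)
class(df_ctor$x)    # expected: "POSIXct" "POSIXt"

# 2. Through column assignment: the value is stored without that coercion.
df_assign <- data.frame(id = 1)
df_assign$x <- lt
class(df_assign$x)  # expected: "POSIXlt" "POSIXt"

# Converting explicitly before assigning keeps the column type predictable.
df_assign$x <- as.POSIXct(lt)
class(df_assign$x)  # expected: "POSIXct" "POSIXt"
```

Converting to POSIXct by hand before assigning is one way to sidestep the difference when downstream code cannot handle POSIXlt columns.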
Digging more, there is an as.data.frame.POSIXlt method, although the help
page for POSIXlt (DateTimeClasses in base) does not mention or document
it. It is documented, though, in as.data.frame (which also has comments
about coercing 1 dimensional arrays).
So, potentially, there could be differences with any class that has an
as.data.frame method, because it will be treated differently if passed to
data.frame versus assigned to a column with `[<-.data.frame`:
> methods("as.data.frame")
[1] as.data.frame.aovproj* as.data.frame.array
[3] as.data.frame.AsIs as.data.frame.character
[5] as.data.frame.complex as.data.frame.data.frame
[7] as.data.frame.Date as.data.frame.default
[9] as.data.frame.difftime as.data.frame.factor
[11] as.data.frame.ftable* as.data.frame.function
[13] as.data.frame.idf* as.data.frame.integer
[15] as.data.frame.list as.data.frame.logical
[17] as.data.frame.logLik* as.data.frame.matrix
[19] as.data.frame.model.matrix as.data.frame.numeric
[21] as.data.frame.numeric_version as.data.frame.ordered
[23] as.data.frame.POSIXct as.data.frame.POSIXlt
[25] as.data.frame.raw as.data.frame.table
[27] as.data.frame.ts as.data.frame.vector
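To check whether a particular class has its own as.data.frame method, rather than scanning the listing by eye, something like the following might do. The helper name has_adf_method is hypothetical; getS3method() with optional = TRUE is in base R's utils package and returns NULL when no method is found. Note that having a method only means the constructor path runs class-specific code; whether that code changes the class (as it does for POSIXlt) still has to be checked per class.

```r
# Hypothetical helper: TRUE if an as.data.frame method exists for class cls.
has_adf_method <- function(cls) {
  !is.null(utils::getS3method("as.data.frame", cls, optional = TRUE))
}

has_adf_method("POSIXlt")          # a method exists, per the listing above
has_adf_method("no_such_class_x")  # made-up class name, so no method

# For a class with a method, compare the two entry routes directly.
lt <- as.POSIXlt(as.Date("2008-01-01"))
via_ctor <- data.frame(x = lt)$x     # goes through as.data.frame.POSIXlt
via_assign <- local({
  d <- data.frame(id = 1)
  d$x <- lt                          # column assignment, no such dispatch
  d$x
})
identical(class(via_ctor), class(via_assign))  # FALSE when coercion differs
```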
So, I suppose it is working as documented. Though I wonder how long ago
it was that someone (who has been using R regularly for at least a year)
actually read the entire help page for data.frame and/or as.data.frame.
It's one of those things you think you know and understand until you
find out you don't.
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University