[R] puzzled by math on date-time objects
Denis Chabot
chabotd at globetrotter.net
Wed Mar 11 13:29:07 CET 2009
Hi Phil,
Thank you very much for this detailed explanation. It will help
me when summarizing information over periods of time using either
summarize (Hmisc) or summaryBy (doBy). Until now, doing so turned the
"mean" time for each "group" into a number of seconds,
as you explain below, and neither function puts it back into a
POSIX date-time object. I tried to do so with "as.POSIXct()", but
this failed because I did not provide an origin (reference date). From
now on I'll try the structure command you used below.
Denis
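
For reference, here is a minimal sketch of the origin-based route, assuming base R only. The `times` and `grp` vectors are made-up stand-ins for real data, and the per-group mean is computed with split() and vapply() rather than with summarize or summaryBy:

```r
# Made-up example data: four timestamps in two groups (UTC avoids DST effects).
times <- as.POSIXct(c("2009-02-24 14:51:18", "2009-02-24 14:51:20",
                      "2009-02-24 15:00:00", "2009-02-24 15:00:10"),
                    tz = "UTC")
grp <- c("a", "a", "b", "b")

# Per-group mean of the underlying seconds since 1970-01-01 00:00:00 UTC.
secs <- vapply(split(as.numeric(times), grp), mean, numeric(1))

# Convert the plain numbers back to POSIXct, supplying the origin explicitly.
back <- as.POSIXct(unname(secs), origin = "1970-01-01", tz = "UTC")
format(back, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Passing tz explicitly keeps the printed result independent of the local time zone of the session.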
On 09-03-10 at 19:04, Phil Spector wrote:
> Denis -
> If you look inside of summary.POSIXct, you'll see the
> following:
>
> x <- summary.default(unclass(object), digits = digits, ...)[1:6]
>
> In other words, summary accepts the POSIX object, unclasses it
> (resulting in a numeric value representing the number of seconds
> since January 1, 1970), performs the operation, and then reassigns
> the class. You can do this basic trick yourself. Suppose we have a
> vector of dates and want the median:
>
>> dates = as.POSIXct(c('2009-3-15','2009-2-19','2009-3-20','2009-2-18'))
>> median(dates)
> Error in Summary.POSIXct(c(1235030400, 1237100400), na.rm = FALSE) :
> 'sum' not defined for "POSIXt" objects
>> res = median(as.numeric(dates))
>> structure(res,class='POSIXct')
> [1] "2009-03-02 23:30:00 PST"
>
> I think it's clear that you can do any arithmetic operation on
> dates this way, even if it doesn't make sense:
>
>> sum(dates)
> Error in Summary.POSIXct(c(1237100400, 1235030400, 1237532400, 1234944000 :
>   'sum' not defined for "POSIXt" objects
>> res = sum(as.numeric(dates))
>> structure(res,class='POSIXct')
> [1] "2126-09-08 23:00:00 PDT"
>
> I'm quite certain that median.POSIXct will be fixed pretty quickly,
> but you can always unclass and reclass to do what you need.
>
> - Phil
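
The unclass-and-reclass trick described above can be wrapped in a small function. This is only a sketch: `median_posix` is a hypothetical name, not something in base R, and the class vector is written in the order used by current R versions (R 2.8.1 reported it as "POSIXt" "POSIXct"). It also copies the input's time zone attribute, which the bare structure() call does not do:

```r
# Hypothetical helper (not part of base R): median of a POSIXct vector via
# unclass / reclass, keeping the time zone attribute of the input.
median_posix <- function(x, na.rm = FALSE) {
  stopifnot(inherits(x, "POSIXct"))
  # as.numeric() drops the class, leaving seconds since 1970-01-01 UTC.
  secs <- median(as.numeric(x), na.rm = na.rm)
  # Put the class (and time zone, if any) back on the result.
  structure(secs, class = c("POSIXct", "POSIXt"), tzone = attr(x, "tzone"))
}

# Same four dates as in the message above, but pinned to UTC.
dates <- as.POSIXct(c("2009-03-15", "2009-02-19", "2009-03-20", "2009-02-18"),
                    tz = "UTC")
median_posix(dates)
```

With an even number of dates the result is the midpoint of the two middle values, here halfway between 2009-02-19 and 2009-03-15.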
>
> On Tue, 10 Mar 2009, Denis Chabot wrote:
>
>> Thanks Phil,
>>
>> but how does summary() find the median of the same type of object?
>> I would have thought the algorithm used when the vector length is even
>> would also require the SUM of the POSIX vector. I am glad of the
>> solution you propose, but still a bit puzzled!
>>
>> Denis
>> On 09-03-10 at 12:39, Phil Spector wrote:
>>
>>> Denis -
>>> There is no median method for POSIX objects, although
>>> there is a summary method. Thus, when you pass a POSIX
>>> object to median, it uses median.default, which contains
>>> the following code:
>>>
>>>     if (n%%2L == 1L)
>>>         sort(x, partial = half)[half]
>>>     else sum(sort(x, partial = half + 0L:1L)[half + 0L:1L])/2
>>> So when the length of your POSIX vector is odd, it works, but if
>>> it's even, it would need to take the sum of a POSIX
>>> object. Of course, there is no sum method for POSIX objects,
>>> since it doesn't make sense.
>>> Right now, it looks like your best bet for a summary of POSIX
>>> objects is
>>> summary(a)['Median']
>>>
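
The branch quoted above can be replayed on plain numbers to see why only even lengths reach sum(). A rough sketch, in which `median_sketch` is a hypothetical name and a full sort stands in for the partial sort used by median.default:

```r
# Hypothetical re-creation of the odd/even branch, working on plain seconds.
# Assumes a non-empty input without missing values.
median_sketch <- function(x) {
  secs <- sort(as.numeric(x))   # strip any class, then sort
  n <- length(secs)
  half <- (n + 1L) %/% 2L       # index of the (lower) middle element
  if (n %% 2L == 1L) {
    secs[half]                  # odd length: pick one element, no arithmetic
  } else {
    sum(secs[half + 0L:1L]) / 2 # even length: sum of the two middle elements
  }
}

median_sketch(c(3, 1, 2))     # odd length, middle element
median_sketch(c(4, 1, 3, 2))  # even length, average of 2 and 3
```

On a classed POSIXct vector the odd branch only indexes, which keeps the class, while the even branch calls sum() on date-times, which is where the error in this thread comes from.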
>>> - Phil Spector
>>> Statistical Computing Facility
>>> Department of Statistics
>>> UC Berkeley
>>> spector at stat.berkeley.edu
>>> On Tue, 10 Mar 2009, Denis Chabot wrote:
>>>> Hi,
>>>> I don't understand the following. When I create a small
>>>> artificial set of date information in class POSIXct, I can
>>>> calculate the mean and the median:
>>>> a = as.POSIXct(Sys.time())
>>>> a = a + 60*0:10; a
>>>> [1] "2009-03-10 11:30:16 EDT" "2009-03-10 11:31:16 EDT" "2009-03-10 11:32:16 EDT"
>>>> [4] "2009-03-10 11:33:16 EDT" "2009-03-10 11:34:16 EDT" "2009-03-10 11:35:16 EDT"
>>>> [7] "2009-03-10 11:36:16 EDT" "2009-03-10 11:37:16 EDT" "2009-03-10 11:38:16 EDT"
>>>> [10] "2009-03-10 11:39:16 EDT" "2009-03-10 11:40:16 EDT"
>>>> median(a)
>>>> [1] "2009-03-10 11:35:16 EDT"
>>>> mean(a)
>>>> [1] "2009-03-10 11:35:16 EDT"
>>>> But for real data (for this post, a short subset is in object c)
>>>> that I have converted into a POSIXct object, I cannot calculate
>>>> the median with median(), though I do get it with summary():
>>>> c
>>>> [1] "2009-02-24 14:51:18 EST" "2009-02-24 14:51:19 EST" "2009-02-24 14:51:19 EST"
>>>> [4] "2009-02-24 14:51:20 EST" "2009-02-24 14:51:20 EST" "2009-02-24 14:51:21 EST"
>>>> [7] "2009-02-24 14:51:21 EST" "2009-02-24 14:51:22 EST" "2009-02-24 14:51:22 EST"
>>>> [10] "2009-02-24 14:51:22 EST"
>>>> class(c)
>>>> [1] "POSIXt" "POSIXct"
>>>> median(c)
>>>> Error in Summary.POSIXct(c(1235505080.6, 1235505081.1), na.rm = FALSE) :
>>>>   'sum' not defined for "POSIXt" objects
>>>> One difference is that in my own date-time series, some events
>>>> are repeated (the original data contained fractions of seconds).
>>>> But then, why can I get a median through summary()?
>>>> summary(c)
>>>>                      Min.                   1st Qu.                    Median
>>>> "2009-02-24 14:51:18 EST" "2009-02-24 14:51:19 EST" "2009-02-24 14:51:20 EST"
>>>>                      Mean                   3rd Qu.                      Max.
>>>> "2009-02-24 14:51:20 EST" "2009-02-24 14:51:21 EST" "2009-02-24 14:51:22 EST"
>>>> Thanks in advance,
>>>> Denis Chabot
>>>> sessionInfo()
>>>> R version 2.8.1 Patched (2009-01-19 r47650)
>>>> i386-apple-darwin9.6.0
>>>> locale:
>>>> fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>> other attached packages:
>>>> [1] doBy_3.7 chron_2.3-30
>>>> loaded via a namespace (and not attached):
>>>> [1] Hmisc_3.5-2 cluster_1.11.12 grid_2.8.1 lattice_0.17-20 tools_2.8.1