[Rd] Long execution time for quantile() and difftime objects (PR#14091)
hong.ooi at anz.com
Fri Nov 27 06:55:10 CET 2009
Full_Name: Hong Ooi
Version: 2.10.0
OS: Windows XP
Submission from: (NULL) (203.110.235.1)
While trying to get summary statistics on a duration variable (the difference
between a start and end date), I ran into the following issue. Calling summary()
or quantile() (which summary() calls) on a difftime object takes an extremely
long time even when the object is only moderately large: about 44 seconds for
10,000 elements in the example below.
A reproducible example:
> x <- as.Date(1:10000, origin="1900-01-01")
> x[1:10]
[1] "1900-01-02" "1900-01-03" "1900-01-04" "1900-01-05" "1900-01-06"
[6] "1900-01-07" "1900-01-08" "1900-01-09" "1900-01-10" "1900-01-11"
> d <- x - as.Date("1900-01-01")
> d[1:10]
Time differences in days
[1] 1 2 3 4 5 6 7 8 9 10
> system.time(summary(d[1:10]))
   user  system elapsed
   0.01    0.00    0.01
> system.time(summary(d[1:100]))
   user  system elapsed
   0.21    0.00    0.20
> system.time(summary(d[1:1000]))
   user  system elapsed
   3.02    0.00    3.02
> system.time(summary(d[1:10000]))
   user  system elapsed
  43.56    0.04   43.66
If I unclass d, there is no problem:
> system.time(summary(unclass(d[1:10000])))
   user  system elapsed
      0       0       0
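Based on the unclass() observation above, one possible workaround is to compute the quantiles on the plain numeric values and reattach the units afterwards. This is only a sketch: the names q and q_dt are made up for illustration, and it assumes the units of d should carry over unchanged to the result.

```r
# Rebuild the example data from the report.
x <- as.Date(1:10000, origin = "1900-01-01")
d <- x - as.Date("1900-01-01")

# Work on the underlying numbers so that no difftime methods are dispatched.
q <- quantile(as.numeric(d))

# Reattach the original units ("days" here) to get a difftime result back.
# q_dt is a hypothetical name, not part of any R API.
q_dt <- as.difftime(as.numeric(q), units = units(d))
names(q_dt) <- names(q)
print(q_dt)
```

This keeps the numeric work in quantile.default and only touches the difftime class once, at the end.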
Testing with Rprof() indicates that the problem lies in [.difftime, although the
code for that function seems innocuous enough.
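For reference, a profiling run along these lines could look roughly like the sketch below. The file name prof_file, the sampling interval, and the repeat count are arbitrary choices for illustration, and the tryCatch() is there only because summaryRprof() stops with an error when no samples were collected (which can happen if the call is fast).

```r
# Rebuild the example data from the report.
x <- as.Date(1:10000, origin = "1900-01-01")
d <- x - as.Date("1900-01-01")

# Profile repeated summary() calls; prof_file is a throwaway temp file.
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)
for (i in 1:50) invisible(summary(d[1:1000]))
Rprof(NULL)

# Tabulate time by function; fall back to NULL if nothing was sampled.
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) print(head(prof$by.self))
```

On an affected version, the by.self table is where a subsetting method such as [.difftime would be expected to show up near the top.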
> sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32
locale:
[1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base