[Rd] Small performance bug in [.Date
Martin Maechler
maechler at stat.math.ethz.ch
Tue Nov 21 11:39:00 CET 2017
>>>>> Hadley Wickham <h.wickham at gmail.com>
>>>>> on Mon, 20 Nov 2017 12:50:24 -0600 writes:
> Hi all,
> I think there's an unnecessary line in [.Date which has a considerable
> impact on performance when subsetting large dates:
> x <- Sys.Date() + 1:1e6
> microbenchmark::microbenchmark(x[1])
> #> Unit: microseconds
> #> expr min lq mean median uq max neval
> #> x[1] 920.651 1039.346 3624.833 2294.404 3786.881 41176.38 100
> `[.Date` <- function(x, ..., drop = TRUE) {
> cl <- oldClass(x)
> # class(x) <- NULL
> val <- NextMethod("[")
> class(val) <- cl
> val
> }
> microbenchmark::microbenchmark(x[1])
> #> Unit: microseconds
> #> expr min lq mean median uq max neval
> #> x[1] 2.738 3.0225 28.40893 3.269 3.513 2470.068 100
> Setting the class of x to NULL is problematic because it forces a
> copy, and I'm pretty sure it's unnecessary as NextMethod() does not
> consult the class of x, but instead uses .Class.
Yes, at least so it looks in src/main/objects.c
Also, we had a very similar change a while ago :
------------------------------------------------------------------------
r65926 | luke | 2014-06-12 15:54:38 +0200 (Thu, 12. Jun 2014) | 2 Zeilen
Geänderte Pfade:
M src/library/base/R/datetime.R
Commented out class(x) <- NULL in [.POSIXct and [[.POSICct.
------------------------------------------------------------------------
and we never seemed to have followed up in a systematic manner
finding other places where this happens and could be
eliminated. I see about half a dozen examples in
base/R/dates.R alone and am trying to find more in other places.
[maybe this used to be necessary for very early different
versions of NextMethod() which were not yet optimized using .Class etc]
Thank you very much,
Hadley!
Martin
More information about the R-devel
mailing list