[Rd] %in% very slow for Date class since R 4.3
Kurt Hornik
Wed Jun 25 07:11:40 CEST 2025
>>>>> hormutz screed writes:
Thanks. Makes sense to me, needs some discussion in R Core ...
Best
-k
> I recently became aware that using %in% for the Date class is about
> 100x slower from R 4.3 onward than in older versions. I did not
> include the results from R prior to 4.3 but the first and second
> methods below yield equal and very fast results for older R versions.
> I have suggested a fix that treats the Date class in an identical
> manner to POSIXct and POSIXlt via the mtfrm generic which is
> ultimately called by %in%. I only found one reference to this issue
> (see https://stackoverflow.com/questions/77909868/why-is-match-slower-on-dates-datetimes-in-r-version-4-3-2-than-version-4-2-2).
> I apologize if this should have been sent to r-help at R-project.org or
> if this issue has already been addressed. Thanks.
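
A minimal sketch of why the two representations differ so much in cost. It assumes, as the mtfrm documentation suggests but without checking every release, that classed objects lacking their own mtfrm method are matched through their character form. The vector `d` and the names `bare` and `chr` are illustrative and not part of the original report.

```r
# Illustrative values only; not taken from the original report.
d <- as.Date(c("1970-01-02", "2024-05-01"))

# What the proposed Date method would hand to match(): the bare
# day counts, with the class attribute dropped.
bare <- as.vector(d, "any")
stopifnot(is.numeric(bare), is.null(attributes(bare)), bare[1] == 1)

# What a character-based fallback has to build instead: one formatted
# string per element, which is the costly part on long vectors.
chr <- as.character(d)
stopifnot(identical(chr, c("1970-01-02", "2024-05-01")))

# Both representations locate the same element.
stopifnot(identical(match(bare[2], bare), match(chr[2], chr)))
```

Matching on the numeric form skips the per-element date formatting, which would be consistent with the speedup reported in the session below.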
> ------------------------------------------------------------------------------------------------------------
> RStudio session below; note that R --vanilla gives the same results
> ------------------------------------------------------------------------------------------------------------
>> sessionInfo()$R.version$version.string #
> [1] "R version 4.5.1 (2025-06-13)"
>>
>> date_seq <- seq(as.Date("1705-01-01"), as.Date("2024-12-31"), by="days")
>> dt1 <- as.Date("2024-05-01")
>>
>> # %in%
>> tictoc::tic()
>> tmp <- dt1 %in% date_seq
>> tictoc::toc()
> 0.125 sec elapsed
>>
>> # cast to integer then %in% (gives fast results similar to old R without casting to int)
>> tictoc::tic()
>> tmp <- as.integer(dt1) %in% as.integer(date_seq)
>> tictoc::toc()
> 0.001 sec elapsed
>>
>> # Create an mtfrm method for Date class that is identical to POSIXct and POSIXlt methods
>> # This results in the expected dramatic speedup
>> temp_fun <- function(x)
> + as.vector(x, "any")
>>
>> .S3method("mtfrm", "Date", temp_fun)
>>
>> # %in% with mtfrm method for Date
>> tictoc::tic()
>> tmp <- dt1 %in% date_seq
>> tictoc::toc()
> 0.002 sec elapsed
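
For anyone who wants to rerun the comparison without installing tictoc, here is a rough base-R sketch of the first two timings above. The helper `elapsed` and the repeat count are hypothetical additions, not part of the original session, and the absolute numbers will vary by machine and R version.

```r
# Same data as in the report above.
date_seq <- seq(as.Date("1705-01-01"), as.Date("2024-12-31"), by = "day")
dt1 <- as.Date("2024-05-01")

# Hypothetical helper: total elapsed seconds over `reps` evaluations of `f`.
elapsed <- function(f, reps = 5L) {
  system.time(for (i in seq_len(reps)) f())[["elapsed"]]
}

t_date <- elapsed(function() dt1 %in% date_seq)
t_int  <- elapsed(function() as.integer(dt1) %in% as.integer(date_seq))

# The two approaches must agree on the answer, whatever their speed.
res_date <- dt1 %in% date_seq
res_int  <- as.integer(dt1) %in% as.integer(date_seq)
stopifnot(identical(res_date, res_int), isTRUE(res_date))

cat(sprintf("Date: %.3f s, integer: %.3f s (5 runs each)\n", t_date, t_int))
```

Repeating each call a few times smooths out timer resolution, since a single integer match can finish in well under a millisecond.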
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel