[Rd] %in% very slow for Date class since R 4.3

Kurt Hornik
Wed Jun 25 07:11:40 CEST 2025


>>>>> hormutz screed writes:

Thanks.  Makes sense to me, needs some discussion in R Core ...

Best
-k

> I recently became aware that using %in% for the Date class is about
> 100x slower from R 4.3 onward than in older versions.  I did not
> include the results from R prior to 4.3 but the first and second
> methods below yield equal and very fast results for older R versions.

> I have suggested a fix that treats the date class in an identical
> manner to POSIXct and POSIXlt via the mtfrm generic which is
> ultimately called by %in%.  I only found one reference to this issue
> (see https://stackoverflow.com/questions/77909868/why-is-match-slower-on-dates-datetimes-in-r-version-4-3-2-than-version-4-2-2).
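
To make the dispatch described above concrete, here is a minimal sketch that assumes R >= 4.3, where `mtfrm()` is available in base. It only inspects the storage type of the matching key that `mtfrm()` produces for a Date and for a POSIXct value. The variable names are illustrative and not from the original message. In a session without a Date method, the Date key is expected to come from the default method rather than a class-specific one.

```r
# Sketch, assuming R >= 4.3 (where base::mtfrm() exists).
d  <- as.Date("2024-05-01")
ct <- as.POSIXct("2024-05-01 12:00:00", tz = "UTC")

# match() and %in% transform classed objects via mtfrm() before matching.
key_date <- mtfrm(d)    # falls to the default method unless a Date method is registered
key_ct   <- mtfrm(ct)   # POSIXct has its own mtfrm method

# Compare the storage types of the two matching keys.
print(c(Date = typeof(key_date), POSIXct = typeof(key_ct)))

# The method body proposed in the post, applied by hand to a Date.
print(typeof(as.vector(d, "any")))
```

If the two storage types differ, that difference is a plausible source of the timing gap, since a long Date vector would be converted element by element before matching.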

> I apologize if this should have been sent to r-help at R-project.org or
> if this issue has already been addressed.  Thanks.

> ------------------------------------------------------------------------------------------------------------
> RStudio session below; note that R --vanilla gives the same results
> ------------------------------------------------------------------------------------------------------------
>> sessionInfo()$R.version$version.string
> [1] "R version 4.5.1 (2025-06-13)"
>> 
>> date_seq <- seq(as.Date("1705-01-01"), as.Date("2024-12-31"), by="days")
>> dt1 <- as.Date("2024-05-01")
>> 
>> # %in%
>> tictoc::tic()
>> tmp <- dt1 %in% date_seq
>> tictoc::toc()
> 0.125 sec elapsed
>> 
>> # cast to integer then %in% (gives fast results similar to old R without casting to int)
>> tictoc::tic()
>> tmp <- as.integer(dt1) %in% as.integer(date_seq)
>> tictoc::toc()
> 0.001 sec elapsed
>> 
>> # Create an mtfrm method for Date class that is identical to POSIXct and POSIXlt methods
>> # This results in the expected dramatic speedup
>> temp_fun <- function(x)
> +   as.vector(x, "any")
>> 
>> .S3method("mtfrm", "Date", temp_fun)
>> 
>> # %in% with mtfrm method for Date
>> tictoc::tic()
>> tmp <- dt1 %in% date_seq
>> tictoc::toc()
> 0.002 sec elapsed
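
The comparison in the session above can also be reproduced with base R only. The following is a sketch under stated assumptions: R >= 4.3, `system.time()` in place of tictoc, and a hypothetical helper name `mtfrm_date` for the method body quoted in the post. It uses `unclass()` rather than `as.integer()` for the workaround so that the underlying values are compared unchanged. Timings will vary by machine, so none are shown.

```r
# Sketch only: base R (>= 4.3, where mtfrm() exists); no tictoc needed.
date_seq <- seq(as.Date("1705-01-01"), as.Date("2024-12-31"), by = "days")
dt1 <- as.Date("2024-05-01")

# Baseline: %in% on Date objects, using whatever mtfrm method is currently active.
t_date <- system.time(res_date <- dt1 %in% date_seq)[["elapsed"]]

# Workaround: compare the underlying day counts directly.
t_num <- system.time(res_num <- unclass(dt1) %in% unclass(date_seq))[["elapsed"]]

# Proposed method, registered for this session only (same body as in the post).
mtfrm_date <- function(x) as.vector(x, "any")   # hypothetical helper name
.S3method("mtfrm", "Date", mtfrm_date)
t_fix <- system.time(res_fix <- dt1 %in% date_seq)[["elapsed"]]

# All three approaches should agree on the answer; only the timings differ.
stopifnot(identical(res_date, res_num), identical(res_num, res_fix))
print(c(date = t_date, numeric = t_num, with_method = t_fix))
```

One caveat that may be relevant to the discussion: Date values can carry fractional days, so matching on the underlying numbers is not guaranteed to agree with matching on formatted date strings for such inputs.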

> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


