I recently became aware that using %in% on objects of class Date is about 100x slower from R 4.3 onward than in older versions. I did not include results from R prior to 4.3, but in those versions the first and second methods in the session below are equally fast.
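For anyone who wants to reproduce the comparison without the tictoc package, here is a minimal sketch using only base R. It is an illustration under the same assumptions as my session (same date range, same single lookup date), and the variable names t_date, t_int, res_date and res_int are just ones I made up for this sketch. Timings will differ by machine and R version, so no expected numbers are shown.

```r
# Base-R reproduction of the two approaches (no tictoc needed).
date_seq <- seq(as.Date("1705-01-01"), as.Date("2024-12-31"), by = "days")
dt1 <- as.Date("2024-05-01")

# Method 1: %in% directly on Date objects
t_date <- system.time(res_date <- dt1 %in% date_seq)[["elapsed"]]

# Method 2: cast both sides to integer, then %in%
t_int <- system.time(
  res_int <- as.integer(dt1) %in% as.integer(date_seq)
)[["elapsed"]]

# Both approaches should give the same answer; only the timing differs.
stopifnot(identical(res_date, res_int))

# Elapsed seconds for each approach (machine dependent)
print(c(date = t_date, integer = t_int))
```

A single call is a coarse measurement; repeating each call in a loop would give steadier numbers, but one call is enough to see a gap of this size.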
I have suggested a fix that treats the Date class in the same way as POSIXct and POSIXlt, via the mtfrm generic, which is ultimately called by %in%. I only found one reference to this issue (see https://stackoverflow.com/questions/77909868/why-is-match-slower-on-dates-datetimes-in-r-version-4-3-2-than-version-4-2-2).

I apologize if this should have been sent to r-h...@r-project.org or if this issue has already been addressed.

Thanks.

------------------------------------------------------------------------------------------------------------
RStudio session below; note that R --vanilla gives the same results
------------------------------------------------------------------------------------------------------------
> sessionInfo()$R.version$version.string
# [1] "R version 4.5.1 (2025-06-13)"
>
> date_seq <- seq(as.Date("1705-01-01"), as.Date("2024-12-31"), by="days")
> dt1 <- as.Date("2024-05-01")
>
> # %in%
> tictoc::tic()
> tmp <- dt1 %in% date_seq
> tictoc::toc()
0.125 sec elapsed
>
> # cast to integer then %in% (gives fast results similar to old R without casting to int)
> tictoc::tic()
> tmp <- as.integer(dt1) %in% as.integer(date_seq)
> tictoc::toc()
0.001 sec elapsed
>
> # Create an mtfrm method for Date class that is identical to POSIXct and POSIXlt methods
> # This results in the expected dramatic speedup
> temp_fun <- function(x)
+ as.vector(x, "any")
>
> .S3method("mtfrm", "Date", temp_fun)
>
> # %in% with mtfrm method for Date
> tictoc::tic()
> tmp <- dt1 %in% date_seq
> tictoc::toc()
0.002 sec elapsed

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel