I wouldn't say 'easy', and so I think we need a business case for this change. (One of the issues is that the internals are used elsewhere and optimized for inputs without NAs. So we would need to write separate code if we pass NAs down to C level. As I recall, is.unsorted was a cheap R interface to existing C code.)
What real-world problems are being affected by this, and what would be proportional speedup of the whole analysis be from this change? On Thu, 17 Apr 2008, Bill Dunlap wrote: > On Thu, 17 Apr 2008, Herve Pages wrote: > >> Couldn't is.unsorted() bail out immediately here (after comparing >> the first 2 elements): >> >> > x <- 20000000:1 >> > system.time(is.unsorted(x), gcFirst=TRUE) >> user system elapsed >> 0.084 0.040 0.124 >> >> > x <- 200000000:1 >> > system.time(is.unsorted(x), gcFirst=TRUE) >> user system elapsed >> 0.772 0.440 1.214 > > The C code does bail out upon seeing the first out- of-order pair, but > before calling the C code, the S code does any(is.na(x)), forcing a > scan of the entire data. If you remove the is.na calls from > is.unsorted's S code you will see the timings improve in your example. > (It looks easy to do the NA checks in the C code.) > > is.unsorted.no.nacheck <- function (x, na.rm = FALSE) { > if (is.null(x)) > return(FALSE) > if (!is.atomic(x)) > return(NA) > .Internal(is.unsorted(x)) > } > > x <- 20000000:1 > > system.time(is.unsorted(x), gcFirst=TRUE) > user system elapsed > 0.356 0.157 0.514 > > system.time(is.unsorted.no.nacheck(x), gcFirst=TRUE) > user system elapsed > 0 0 0 > > revx <- rev(x) > > system.time(is.unsorted(revx), gcFirst=TRUE) > user system elapsed > 0.500 0.170 0.672 > > system.time(is.unsorted.no.nacheck(revx),gcFirst=TRUE) > user system elapsed > 0.131 0.000 0.132 > > ---------------------------------------------------------------------------- > Bill Dunlap > Insightful Corporation > bill at insightful dot com > 360-428-8146 > > "All statements in this message represent the opinions of the author and do > not necessarily reflect Insightful Corporation policy or position." > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel