[Rd] Proposed speedup of ifelse

2018-05-02 Thread Hugh Parsonage
I propose a patch to ifelse that leverages anyNA(test) to achieve an
improvement in performance. For a test vector of length 10, the change
nearly halves the time taken and for a test of length 1 million, there
is a tenfold increase in speed. Even for small vectors, the
distributions of timings between the old and the proposed ifelse do
not intersect.

The patch does not intend to change the behaviour of ifelse (i.e. it
is intended to be a drop-in replacement). However, the patch
inadvertently corrects what I believe to be a bug in the release
version of ifelse: the documentation says that attributes of test are
kept; however, this is not true unless test is atomic.

  library(Matrix)
  M <- Matrix(-10 + 1:28, 4, 7)
  ifelse(M, 1, 2)

The performance improvement does not rely on this, however; so if
current behaviour in these cases is intended, the patch can be
trivially amended to reflect this.

I've written up a short note detailing the performance improvements
and some unit tests at
https://hughparsonage.github.io/content/post/A-new-ifelse.html


Best

Hugh Parsonage
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Proposed speedup of ifelse

2018-05-02 Thread Hugh Parsonage
And the patch itself:


--- a/src/library/base/R/ifelse.R
+++ b/src/library/base/R/ifelse.R
@@ -18,12 +18,14 @@

 ifelse <- function (test, yes, no)
 {
+  attributes_of_test <- attributes(test)
+
   if(is.atomic(test)) { # do not lose attributes
 if (typeof(test) != "logical")
   storage.mode(test) <- "logical"
 ## quick return for cases where 'ifelse(a, x, y)' is used
 ## instead of 'if (a) x else y'
-if (length(test) == 1 && is.null(attributes(test))) {
+if (length(test) == 1 && is.null(attributes_of_test)) {
   if (is.na(test)) return(NA)
   else if (test) {
 if (length(yes) == 1) {
@@ -43,6 +45,62 @@ ifelse <- function (test, yes, no)
   }
   else ## typically a "class"; storage.mode<-() typically fails
 test <- if(isS4(test)) methods::as(test, "logical") else as.logical(test)
+
+  # Give up attempting backwards-compatibility under these conditions:
+  if (typeof(yes) %in% c("logical", "integer", "double", "character") &&
+  typeof(no)  %in% c("logical", "integer", "double", "character") &&
+  !is.factor(yes) &&
+  !is.factor(no) &&
+  length(no) != 0L &&
+  length(yes) != 0L) {
+if (length(no) == length(test)) {
+  out <- no
+} else if (length(no) == 1L) {
+  out <- rep_len(no, length(test))
+} else if (length(no) != 0L) {
+  out <- rep_len(no[1L], length(test))
+} else {
+  return(.ifelse(test, yes, no))
+}
+
+if (length(yes) != 1L && length(yes) != length(test)) {
+  return(.ifelse(test, yes, no))
+}
+
+
+if (anyNA(test)) {
+  # no benefit to saving the na results
+  Yes <- which(test)
+  out[is.na(test)] <- NA
+  if (length(yes) == 1L) {
+out[Yes] <- yes
+  } else if (length(yes) == length(test)) {
+out[Yes] <- yes[Yes]
+  } else {
+return(.ifelse(test, yes, no))
+  }
+} else {
+  # No NAs to deal with
+  if (length(yes) == 1L) {
+out[test] <- yes
+  } else if (length(yes) == length(test)) {
+wtest <- which(test) # faster than test directly
+out[wtest] <- yes[wtest]
+  } else {
+return(.ifelse(test, yes, no))
+  }
+}
+if (!is.null(attributes_of_test)) {
+  attributes(out) <- attributes_of_test
+}
+
+out
+  } else {
+return(.ifelse(test, yes, no))
+  }
+}
+
+.ifelse <- function(test, yes, no) {
   ans <- test
   ok <- !is.na(test)
   if (any(test[ok]))

On 3 May 2018 at 13:58, Hugh Parsonage  wrote:
> I propose a patch to ifelse that leverages anyNA(test) to achieve an
> improvement in performance. For a test vector of length 10, the change
> nearly halves the time taken and for a test of length 1 million, there
> is a tenfold increase in speed. Even for small vectors, the
> distributions of timings between the old and the proposed ifelse do
> not intersect.
>
> The patch does not intend to change the behaviour of ifelse (i.e. it
> is intended to be a drop-in replacement). However, the patch
> inadvertently corrects what I believe to be a bug in the release
> version of ifelse: the documentation says that attributes of test are
> kept; however, this is not true unless test is atomic.
>
>   library(Matrix)
>   M <- Matrix(-10 + 1:28, 4, 7)
>   ifelse(M, 1, 2)
>
> The performance improvement does not rely on this, however; so if
> current behaviour in these cases is intended, the patch can be
> trivially amended to reflect this.
>
> I've written up a short note detailing the performance improvements
> and some unit tests at
> https://hughparsonage.github.io/content/post/A-new-ifelse.html
>
>
> Best
>
> Hugh Parsonage

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel