Hi,

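S4 method dispatch can be very slow. Would it be reasonable to cache the most
recent dispatch, on the expectation that the next invocation will be on the
same type? This would be very helpful in loops. In the example below, fun0
dispatches on every call, while fun1 looks the method up once with
selectMethod() and reuses it.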

  fun0 <- function(x)
      sapply(x, paste, collapse="+")
  fun1 <- function(x) {
      paste <- selectMethod(paste, class(x[[1]]))
      sapply(x, paste, collapse="+")
  }
  lst <- split(rep(LETTERS, 100), rep(1:1300, 2))

  library(microbenchmark)
  microbenchmark(fun0(lst), times=10)
  ## Unit: milliseconds
  ##       expr      min       lq   median      uq      max neval
  ##  fun0(lst) 4.153287 4.180659 4.513539 5.19261 5.280481    10

  setGeneric("paste")
  microbenchmark(fun0(lst), fun1(lst), times=10)
  ## Unit: milliseconds
  ##       expr       min       lq    median        uq       max neval
  ##  fun0(lst) 21.093180 21.27616 21.453174 21.833686 24.758791    10
  ##  fun1(lst)  4.517808  4.53067  4.582641  4.682235  5.121856    10
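
To make the caching idea concrete, here is a rough user-level sketch of what I
have in mind. It is only an illustration, not a proposal for how the methods
package should implement it: cachedDispatch() and the describe() generic are
hypothetical names made up for this example, and the cache looks only at the
class of the first argument, so it ignores multiple dispatch and would go
stale if methods were redefined after the first call.

```r
library(methods)

## Hypothetical helper (not part of the methods package): wrap a generic so
## that the method selected for the previous call is reused for as long as
## the class of the first argument stays the same.
cachedDispatch <- function(generic) {
    lastClass <- NULL
    lastMethod <- NULL
    function(x, ...) {
        cls <- class(x)[1L]
        if (!identical(cls, lastClass)) {
            ## cache miss: do the full method lookup once and remember it
            lastMethod <<- selectMethod(generic, cls)
            lastClass <<- cls
        }
        ## cache hit: call the stored method directly, skipping dispatch
        lastMethod(x, ...)
    }
}

## Toy generic used only for this illustration
setGeneric("describe", function(x, ...) standardGeneric("describe"))
setMethod("describe", "character", function(x, ...) paste(x, collapse="+"))
setMethod("describe", "numeric", function(x, ...) sum(x))

fastDescribe <- cachedDispatch(describe)
res <- sapply(list(c("A", "B"), c("C", "D")), fastDescribe)
```

In a loop over elements of one class, the lookup happens only on the first
iteration; a change of class simply triggers a fresh selectMethod() call.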

Dispatch seems to be especially slow when packages are involved, e.g.,
with the Bioconductor IRanges package
(http://bioconductor.org/packages/release/bioc/html/IRanges.html), where
fun0 takes roughly 234 ms compared with about 21 ms above:

  removeGeneric("paste")
  library(IRanges)
  showMethods(paste)
  ## Function: paste (package BiocGenerics)
  ## ...="ANY"
  ## ...="Rle"
  selectMethod(paste, "ANY")
  ## Method Definition (Class "derivedDefaultMethod"):
  ##
  ## function (..., sep = " ", collapse = NULL)
  ## .Internal(paste(list(...), sep, collapse))
  ## <environment: namespace:base>
  ##
  ## Signatures:
  ##         ...
  ## target  "ANY"
  ## defined "ANY"

  microbenchmark(fun0(lst), fun1(lst), times=10)
  ## Unit: milliseconds
  ##       expr        min         lq     median         uq        max neval
  ##  fun0(lst) 233.539585 234.592491 236.311209 237.268506 243.181123    10
  ##  fun1(lst)   4.564914   4.592996   4.642898   4.729009   5.492706    10

  sessionInfo()
  ## R version 3.0.0 Patched (2013-04-04 r62492)
  ## Platform: x86_64-unknown-linux-gnu (64-bit)
  ##
  ## locale:
  ##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  ##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  ##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  ##  [7] LC_PAPER=C                 LC_NAME=C
  ##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
  ## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
  ##
  ## attached base packages:
  ## [1] parallel  stats     graphics  grDevices utils     datasets  methods
  ## [8] base
  ##
  ## other attached packages:
  ## [1] IRanges_1.19.15      BiocGenerics_0.7.2   microbenchmark_1.3-0
  ##
  ## loaded via a namespace (and not attached):
  ## [1] stats4_3.0.0


Thanks,
Valerie

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
