Re: [R] Speeding up "accumulation" code in large matrix calc?

Petr Savicky Fri, 24 Feb 2012 10:51:03 -0800

On Fri, Feb 24, 2012 at 08:59:44AM -0800, robertfeldt wrote:
> Hi,
> 
> I have R code like so:
> 
> num.columns.back.since.last.occurence <- function(m, outcome) {
>       nrows <- dim(m)[1];
>       ncols <- dim(m)[2];
>       res <- matrix(rep.int(0, nrows*ncols), nrow=nrows);
>       for(row in 1:nrows) {
>               for(col in 2:ncols) {
>                       res[row,col] <- if(m[row,col-1]==outcome) {0} else {1+res[row,col-1]}
>               }
>       }
>       res;
> }
> 
> but on the very large matrices I apply this to, the execution times are a
> problem. I would appreciate any help rewriting this with more
> "standard"/native R functions to speed things up.


Hi.

If the number of columns is large, so the rows are long, then
the following can be more efficient.

  oneRow <- function(x, outcome)
  {
      n <- length(x)
      y <- c(0, cumsum(x[-n] == outcome))
      # number the elements of each group 0, 1, 2, ...
      ave(x, y, FUN = function(z) seq_along(z) - 1)
  }

  # random matrix 
  A <- matrix((runif(49) < 0.2) + 0, nrow=7)

  # the required transformation
  B <- t(apply(A, 1, oneRow, outcome=1))

  # verify
  all(num.columns.back.since.last.occurence(A, 1) == B)

  [1] TRUE
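
To see why oneRow gives the required counts, it may help to trace its two
steps on a single short row. The vector x below is a made-up example, not
data from the original question.

```r
# Hypothetical example row; 1 marks an occurrence of the outcome.
x <- c(0, 1, 0, 0, 1, 0)
outcome <- 1
n <- length(x)

# Each occurrence in positions 1..n-1 starts a new group at the next
# position, so y labels the stretches between occurrences.
y <- c(0, cumsum(x[-n] == outcome))
print(y)    # 0 0 1 1 1 2

# Within each group, number the elements 0, 1, 2, ... from the start.
res <- ave(x, y, FUN = function(z) seq_along(z) - 1)
print(res)  # 0 1 0 1 2 0
```

The counter in res restarts at 0 in the position right after each
occurrence, which is what the original double loop computes.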

This solution loops over the rows (inside apply), so if the
number of rows is large and the number of columns is small,
a solution that loops over the columns instead may be
faster.
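
One possible sketch of such a column loop is below. The function name
byColumns is only an illustration and does not come from the original
post; the code assumes m contains no NA values and has not been
benchmarked here.

```r
# Sketch: loop over columns, updating all rows at once.
# byColumns is a hypothetical name used only for this example.
byColumns <- function(m, outcome) {
    res <- matrix(0, nrow = nrow(m), ncol = ncol(m))
    # seq_len(ncol(m))[-1] is empty for a one-column matrix,
    # so the loop is skipped in that case.
    for (col in seq_len(ncol(m))[-1]) {
        # Increment the previous counter, then reset it to 0 in the rows
        # where the previous column equals the outcome.
        res[, col] <- (res[, col - 1] + 1) * (m[, col - 1] != outcome)
    }
    res
}

# Usage on a random matrix like the one above.
A <- matrix((runif(49) < 0.2) + 0, nrow = 7)
B2 <- byColumns(A, outcome = 1)
```

Each pass through the loop is a vectorized operation over all rows, so
the number of interpreted iterations equals the number of columns minus
one rather than the number of rows.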

Hope this helps.

Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
