On Fri, Feb 24, 2012 at 08:59:44AM -0800, robertfeldt wrote:
> Hi,
>
> I have R code like so:
>
> num.columns.back.since.last.occurence <- function(m, outcome) {
>   nrows <- dim(m)[1];
>   ncols <- dim(m)[2];
>   res <- matrix(rep.int(0, nrows*ncols), nrow=nrows);
>   for(row in 1:nrows) {
>     for(col in 2:ncols) {
>       res[row,col] <- if(m[row,col-1]==outcome) {0} else {1+res[row,col-1]}
>     }
>   }
>   res;
> }
>
> but on the very large matrices I apply this the execution times are a
> problem. I would appreciate any help to rewrite this with more
> "standard"/native R functions to speed things up.

Hi.

If the number of columns is large, so the rows are long, then the
following can be more efficient.

  oneRow <- function(x, outcome) {
      n <- length(x)
      # group id: increases by one right after each occurrence of outcome
      y <- c(0, cumsum(x[-n] == outcome))
      # within each group, number the positions 0, 1, 2, ...
      ave(x, y, FUN = function(z) seq_along(z) - 1)
  }

  # random matrix
  A <- matrix((runif(49) < 0.2) + 0, nrow=7)

  # the required transformation
  B <- t(apply(A, 1, oneRow, outcome=1))

  # verify
  all(num.columns.back.since.last.occurence(A, 1) == B)

  [1] TRUE

This solution loops over the rows (inside apply), so if the number of
rows is large and the number of columns is not, then a solution that
loops over the columns instead may be better.

Hope this helps.

Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
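
The column-loop alternative mentioned in the reply is not spelled out in the thread. A minimal sketch of what it might look like follows, assuming the matrix contains no NA values. The function name cols.back.by.column is hypothetical and does not come from the original post. Each pass of the loop updates a whole column at once, so the R-level loop runs ncol(m) - 1 times regardless of the number of rows.

```r
# Hypothetical helper (name not from the original thread): loop over columns
# and update all rows at once with vectorised operations.
# Assumes m contains no NA values.
cols.back.by.column <- function(m, outcome) {
  res <- matrix(0, nrow = nrow(m), ncol = ncol(m))
  # seq_len(ncol(m))[-1] is 2..ncol(m), and is empty for a one-column matrix
  for (col in seq_len(ncol(m))[-1]) {
    hit <- m[, col - 1] == outcome
    # reset to 0 where the previous column matched, otherwise count one more
    res[, col] <- ifelse(hit, 0, res[, col - 1] + 1)
  }
  res
}

# usage on a random 0/1 matrix like the one in the reply
A <- matrix((runif(49) < 0.2) + 0, nrow = 7)
B2 <- cols.back.by.column(A, 1)
```

Which of the two variants is faster depends on the shape of the matrix, so it is worth timing both (for example with system.time) on data of the real size.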