Hi, I cannot find a 'vectorized' solution to this 'for loop' kind of problem. Do you see a vectorized, fast-running solution?

Objective:
Take the value of X at each timepoint and calculate the corresponding value of Y. Leading 0's and all 1's for X are assigned to Y as they are; otherwise Y is 1 plus the number of consecutive 0's since the most recent 1. The frequency and distribution of X vary widely, and there may be ~100 repeated 0's or 1's in a vector of 10k timepoints.

Example:

time  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
X     0  1  0  1  0  1  0  0  1  1  1  0  0  0  .  .
Y     0  1  2  1  2  1  2  3  1  1  1  2  3  4  .  .

What I have done:
My standard solutions based on for() and apply() are too slow. They are 6 times slower than my prototype vectorized code, which uses cumsum(). However(!)... the results of the vectorized code are inaccurate, and I can't correct them without introducing a for()! Here is my shot at a vectorized solution, as far as I can take it.

Preliminary Vectorized Code:

    X <- matrix(sample(c(1,0,0,0,0), 500, replace = TRUE), 25, 20, byrow=TRUE)
    colnames(X) <- c(paste("a", 1:20, sep=""))
    noX <- X; noX[X!=0] <- 0; cumX <- noX; cumNoX <- noX; Y1 <- noX; Y2 <- X; Y3 <- X
    for (e in 1:ncol(X)) {
      cumX[,e] <- cumsum(X[,e])
      noX[X[,e] < 1 & cumsum(X[,e]) > 0 ,e] <- 1
      cumNoX[,e] <- cumsum(noX[,e])
    }
    Y1[cumNoX > 0] <- cumNoX[cumNoX > 0] + 1
    Y2[X == 0 & noX > 0] <- Y1[X == 0 & noX > 0]
    Y3 <- Y2
    Y3[cumX > 1 & noX > 0] <- Y2[cumX > 1 & noX > 0] - cumX[cumX > 1 & noX > 0]
    X; Y3

Your help would be greatly appreciated! I'm stuck.

Thank you,
Tom Johnson

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
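
One possible vectorized approach to the rule described in the post, offered as a sketch under stated assumptions rather than a definitive answer: record the index of the most recent 1 with cummax(), then take the distance from the current index to it. It assumes X holds only 0's and 1's. The function name count_since_one is made up for this illustration. The only loop left is the per-column apply().

```r
# Sketch: Y = 0 for leading 0's, 1 at every 1, and
# 1 + (number of 0's since the most recent 1) elsewhere.
# Assumes x contains only 0's and 1's.
count_since_one <- function(x) {
  idx <- seq_along(x)
  # Index of the most recent 1 at each timepoint (0 while no 1 has been seen)
  last_one <- cummax(idx * (x == 1))
  # Distance from that 1, plus 1; multiplied by 0 for the leading 0's
  (idx - last_one + 1) * (last_one > 0)
}

# The example vector from the post
x <- c(0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0)
count_since_one(x)

# Column-wise on a matrix built like the one in the post
X <- matrix(sample(c(1, 0, 0, 0, 0), 500, replace = TRUE), 25, 20, byrow = TRUE)
colnames(X) <- paste0("a", 1:20)
Y <- apply(X, 2, count_since_one)
```

For the example vector this returns 0 1 2 1 2 1 2 3 1 1 1 2 3 4, which matches the Y row in the example. Because cummax() carries the position of the last 1 forward, no running counter has to be reset, which is where a cumsum()-only approach tends to need a correction step.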