In addition to Bill's method, you may also use:

vec1 <- rep(c(1,2,3,4,5), c(10,30,24,65,3))
c(0, which(diff(vec1) != 0))
#or
indx <- cumsum(rle(vec1)$lengths)
c(0, indx[-length(indx)])

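To see why the two variants agree, here is a minimal sketch on a six-element vector. The names x, jumps_diff, ends and jumps_rle are made up for this illustration and do not come from the thread.

```r
# Illustrative vector: three runs of lengths 2, 3 and 1
x <- c(7, 7, 9, 9, 9, 4)

# diff() is non-zero exactly where the next element differs from the
# current one, so which() gives the last index of every run but the final one
jumps_diff <- c(0, which(diff(x) != 0))

# rle() returns the run lengths; their cumulative sum is the last index
# of each run, and dropping the final entry removes the end of the vector
ends <- cumsum(rle(x)$lengths)
jumps_rle <- c(0, ends[-length(ends)])

jumps_diff   # 0 2 5
jumps_rle    # 0 2 5
```

Note that diff() needs numeric input, whereas rle() also accepts character vectors, so the second form is probably the safer choice if the repeated values are not numbers.
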
#Bill's method was found to be the fastest
vec3 <- rep(vec1,1e4)

system.time( res <- c(0,which(diff(vec3)!=0)))
#   user  system elapsed
#  0.124   0.000   0.125

system.time({
  indx <- cumsum(rle(vec3)$lengths)
  res2 <- c(0,indx[-length(indx)])})
#   user  system elapsed
#  0.112   0.000   0.112

# isLastInRun() is defined in Bill's message quoted below
system.time({
  indx <- which(isLastInRun(vec3))
  res3 <- c(0,indx[-length(indx)])
})
#   user  system elapsed
#  0.088   0.000   0.086

system.time({
  indx <- cumsum(c(0,abs(diff(vec3))))
  indx2 <- tapply(seq_along(indx),list(indx),FUN=max)
  res4 <- c(0,indx2[-length(indx2)])
})
#   user  system elapsed
#  2.456   0.000   2.457

names(res4) <- NULL
identical(res,res4)
#[1] TRUE
identical(res,res2)
#[1] TRUE
identical(res,res3)
#[1] TRUE

A.K.


On Friday, October 18, 2013 8:31 PM, William Dunlap <wdun...@tibco.com> wrote:
> I have a very long vector (length=1855190) it looks something like this
>
> 1111...2222...3333....etc so it would be something equivalent of doing:
> rep(c(1,2,3,4,5), c(10,30,24,65,3))
>
> How can I find the index of where the step/jump is? For example using the above I would
> get an index of 0, 10, 40, 64, 129

Define 2 functions:
   isFirstInRun <- function(x) c(TRUE, x[-1]!=x[-length(x)])
   isLastInRun <- function(x) c(x[-1]!=x[-length(x)], TRUE)
and use them as
   > z <- rep(c(1,2,3,4,5), c(10,30,24,65,3))
   > which(isLastInRun(z))
   [1]  10  40  64 129 132
   > which(isFirstInRun(z))
   [1]   1  11  41  65 130
(0 is not a valid R index into a vector, so I prefer one of the above results,
but you can fiddle with the endpoints as you wish.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf
> Of Benton, Paul
> Sent: Friday, October 18, 2013 5:18 PM
> To: r-help@r-project.org
> Subject: [R] find jumps in vector of repeats
>
> Hello all,
>
> I'm not really sure how to search for this in google/Rseek so there is probably a
> command to do it.
> I also know I could write an apply loop to find it but thought I would
> ask all you lovely R gurus.
>
> I have a very long vector (length=1855190) it looks something like this
>
> 1111...2222...3333....etc so it would be something equivalent of doing:
> rep(c(1,2,3,4,5), c(10,30,24,65,3))
>
> How can I find the index of where the step/jump is? For example using the above I would
> get an index of 0, 10, 40, 64, 129
>
> Any help would be greatly appreciated.
>
> Cheers,
>
> Paul
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.