> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of skan
> Sent: Friday, September 10, 2010 12:33 PM
> To: r-help@r-project.org
> Subject: Re: [R] adding zeroes after old zeroes in a vector ??
>
> Hi
>
> I'll study your answers.
>
> I could also try
> gsub("01", "00", x) N times
> but it could be very slow if N is large.
> In fact, when I wrote 111110011 I meant a vector
> 1
> 1
> 1
> 1
> 1
> 0
> 0
> 1
> 1
> not a string, but I wrote it more compactly.
>
> I could also do it by shifting the elements of the vector one
> position and ANDing the result with the original, then shifting
> 2 positions, and so on up to N. But it's very slow.

How did you do the shifting (show code!) and how slow is slow?
What is a typical length for the vector, what is a typical value
of N, and what is a typical number of 0-runs in the vector?
No single piece of code will be fastest over that whole parameter space.

E.g., the following might run out of memory if the product
of the number of runs and N is too big, but seems pretty
quick for moderate N:

  f1 <- function(x, N) {
     nx <- length(x)
     # positions holding a 1 that immediately follows a 0
     indexOfOneAfterZero <- which(c(FALSE, x[-1]==1 & x[-nx]==0))
     # each such position plus the N-1 positions after it
     indexToZero <- outer(indexOfOneAfterZero, seq_len(N)-1, "+")
     # drop indices that run past the end of x
     indexToZero <- indexToZero[indexToZero<=nx]
     x[indexToZero] <- 0
     x
  }

E.g., for a vector with lots of short runs we get:

> system.time(f1(sample(c(0,1),replace=TRUE,size=1e6), N=10))
   user  system elapsed
   0.87    0.08    0.94
> system.time(f1(sample(c(0,1),replace=TRUE,size=1e6), N=100))
   user  system elapsed
   3.75    0.80    4.32
> system.time(f1(sample(c(0,1),replace=TRUE,size=1e6), N=1000))
Error: cannot allocate vector of size 953.8 Mb
Timing stopped at: 0.24 0.03 0.26

You can make one that is a tad slower but still works
when #runs * N is larger:

  f2 <- function (x, N) {
      nx <- length(x)
      # TRUE where a 1 immediately follows a 0
      isOneAfterZero <- c(FALSE, x[-1] == 1 & x[-nx] == 0)
      for (i in seq_len(N)) {
          x[isOneAfterZero] <- 0
          # shift the mask one position to the right
          isOneAfterZero <- c(FALSE,
              isOneAfterZero[-length(isOneAfterZero)])
      }
      x
  }

> system.time(f2(x, N=10))
   user  system elapsed
   0.58    0.03    0.59
> system.time(f2(x, N=100))
   user  system elapsed
   5.08    0.86    5.54
> system.time(f2(x, N=1000))
   user  system elapsed
  49.59    7.84   54.56

The two have very different times when there are few runs and a big N:

> y <- as.integer(sin(seq(0,50,len=1e6))>0)
> system.time(f1(y, N=1000))
   user  system elapsed
   0.13    0.07    0.22
> system.time(f2(y, N=1000))
   user  system elapsed
  39.66    7.36   46.78

My basic point is that when you ask for the fastest solution
you have to describe your problem better.
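
As an illustration of how the trade-off shifts with the approach, here is a rough sketch of a third variant. It is not one of the functions timed above and has not been benchmarked; the name f3 is made up for this example. It assumes, like f1 and f2, that x contains only 0s and 1s. Instead of enumerating #runs * N indices or looping N times, it records for every position the most recent place where a 1 followed a 0, so its time and memory should grow with length(x) only.

```r
# Hypothetical alternative (name f3 is an assumption, not from the thread).
# Zero the N elements starting at each 1 that immediately follows a 0.
f3 <- function(x, N) {
    nx <- length(x)
    idx <- seq_len(nx)
    # TRUE where a 1 immediately follows a 0 (same mask as in f2)
    isOneAfterZero <- c(FALSE, x[-1] == 1 & x[-nx] == 0)
    # for each position, the index of the latest such 1 at or before it
    # (0 if there has been none yet)
    lastStart <- cummax(ifelse(isOneAfterZero, idx, 0L))
    # zero every position that lies fewer than N steps past that index
    x[lastStart > 0 & idx - lastStart < N] <- 0
    x
}

# The small vector from the original question, 111110011:
print(f3(c(1, 1, 1, 1, 1, 0, 0, 1, 1), N = 1))
# [1] 1 1 1 1 1 0 0 0 1
```

Whether this actually beats f1 or f2 for a given vector length, N, and number of runs would still have to be measured with system.time() on representative data.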

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> --
> View this message in context:
> http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534982.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.