On Apr 3, 2015, at 5:17 AM, Morway, Eric wrote:

> This small example will be applied to a problem with 1.4e6 lines of data.
> First, here is the dataset and a few lines of R script, followed by an
> explanation of what I'd like to get:
>
> dat <- read.table(textConnection("ISEG IRCH val
> 1 1 265
> 1 2 260
> 1 3 234
> 54 39 467
> 54 40 468
> 54 41 460
> 54 42 489
> 1 1 265
> 1 2 276
> 1 3 217
> 54 39 456
> 54 40 507
> 54 41 483
> 54 42 457
> 1 1 265
> 1 2 287
> 1 3 224
> 54 39 473
> 54 40 502
> 54 41 497
> 54 42 447
> 1 1 230
> 1 2 251
> 1 3 199
> 54 39 439
> 54 40 474
> 54 41 477
> 54 42 413
> 1 1 230
> 1 2 262
> 1 3 217
> 54 39 455
> 54 40 493
> 54 41 489
> 54 42 431
> 1 1 1002
> 1 2 1222
> 1 3 1198
> 54 39 1876
> 54 40 1565
> 54 41 1455
> 54 42 1427
> 1 1 1002
> 1 2 1246
> 1 3 1153
> 54 39 1813
> 54 40 1490
> 54 41 1518
> 54 42 1486
> 1 1 1002
> 1 2 1229
> 1 3 1142
> 54 39 1797
> 54 40 1517
> 54 41 1527
> 54 42 1514"),header=TRUE)
>
> dat$seq <- ifelse(dat$ISEG==1 & dat$IRCH==1, 1, 0)
> tmp <- diff(dat[dat$seq==1,]$val)!=0
> dat$idx <- 0
> dat[dat$seq==1,][c(TRUE,tmp),]$idx <- 1
> dat$ts <- cumsum(dat$idx)
>
> At this point, I'd like to add one more column called "iter" that counts up
> by 1 based on "seq", but within each "ts". So, the result would look like
> this (undoubtedly this is a simple problem with something like ddply, but
> I've been unable to construct the R for it):

> dat$iter2 <- ave(dat$seq, dat$ts,FUN=cumsum)
> dat
   ISEG IRCH val seq idx ts iter iter2
1     1    1 265   1   1  1  1_1     1
2     1    2 260   0   0  1  1_1     1
3     1    3 234   0   0  1  1_1     1
4    54   39 467   0   0  1  1_1     1
5    54   40 468   0   0  1  1_1     1
6    54   41 460   0   0  1  1_1     1
7    54   42 489   0   0  1  1_1     1
8     1    1 265   1   0  1  1_2     2
9     1    2 276   0   0  1  1_2     2
10    1    3 217   0   0  1  1_2     2
11   54   39 456   0   0  1  1_2     2
12   54   40 507   0   0  1  1_2     2
13   54   41 483   0   0  1  1_2     2
14   54   42 457   0   0  1  1_2     2
15    1    1 265   1   0  1  1_3     3
16    1    2 287   0   0  1  1_3     3
17    1    3 224   0   0  1  1_3     3
18   54   39 473   0   0  1  1_3     3
19   54   40 502   0   0  1  1_3     3
20   54   41 497   0   0  1  1_3     3
21   54   42 447   0   0  1  1_3     3
22    1    1 230   1   1  2  2_4     1
23    1    2 251   0   0  2  2_4     1
snipped----->

-- 
David

>
> dat
> ISEG IRCH  val seq idx ts iter
>    1    1  265   1   1  1    1
>    1    2  260   0   0  1    1
>    1    3  234   0   0  1    1
>   54   39  467   0   0  1    1
>   54   40  468   0   0  1    1
>   54   41  460   0   0  1    1
>   54   42  489   0   0  1    1
>    1    1  265   1   0  1    2
>    1    2  276   0   0  1    2
>    1    3  217   0   0  1    2
>   54   39  456   0   0  1    2
>   54   40  507   0   0  1    2
>   54   41  483   0   0  1    2
>   54   42  457   0   0  1    2
>    1    1  265   1   0  1    3
>    1    2  287   0   0  1    3
>    1    3  224   0   0  1    3
>   54   39  473   0   0  1    3
>   54   40  502   0   0  1    3
>   54   41  497   0   0  1    3
>   54   42  447   0   0  1    3
>    1    1  230   1   1  2    1
>    1    2  251   0   0  2    1
>    1    3  199   0   0  2    1
>   54   39  439   0   0  2    1
>   54   40  474   0   0  2    1
>   54   41  477   0   0  2    1
>   54   42  413   0   0  2    1
>    1    1  230   1   0  2    2
>    1    2  262   0   0  2    2
>    1    3  217   0   0  2    2
>   54   39  455   0   0  2    2
>   54   40  493   0   0  2    2
>   54   41  489   0   0  2    2
>   54   42  431   0   0  2    2
>    1    1 1002   1   1  3    1
>    1    2 1222   0   0  3    1
>    1    3 1198   0   0  3    1
>   54   39 1876   0   0  3    1
>   54   40 1565   0   0  3    1
>   54   41 1455   0   0  3    1
>   54   42 1427   0   0  3    1
>    1    1 1002   1   0  3    2
>    1    2 1246   0   0  3    2
>    1    3 1153   0   0  3    2
>   54   39 1813   0   0  3    2
>   54   40 1490   0   0  3    2
>   54   41 1518   0   0  3    2
>   54   42 1486   0   0  3    2
>    1    1 1002   1   0  3    3
>    1    2 1229   0   0  3    3
>    1    3 1142   0   0  3    3
>   54   39 1797   0   0  3    3
>   54   40 1517   0   0  3    3
>   54   41 1527   0   0  3    3
>   54   42 1514   0   0  3    3
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA
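
For anyone who wants to try the ave() suggestion without pasting the full dataset, here is a minimal sketch on a toy data frame. The name `toy`, its values, and the helper vector `starts` are made up for illustration and are not from the original post. The `idx` step is written with `which()` as an assumed equivalent of the chained subset assignment in the question, not as the poster's own code.

```r
# Toy data (hypothetical): three blocks of three rows each; every block
# begins with the row where ISEG == 1 and IRCH == 1.
toy <- data.frame(
  ISEG = c(1, 1, 54,   1, 1, 54,   1, 1, 54),
  IRCH = c(1, 2, 39,   1, 2, 39,   1, 2, 39),
  val  = c(265, 260, 467,   265, 276, 456,   230, 251, 439)
)

# Flag the first row of each block, as in the original post.
toy$seq <- ifelse(toy$ISEG == 1 & toy$IRCH == 1, 1, 0)

# Mark a new "ts" whenever val on a block-start row differs from val on
# the previous block-start row (the first block always starts a new ts).
starts <- which(toy$seq == 1)
toy$idx <- 0
toy$idx[starts[c(TRUE, diff(toy$val[starts]) != 0)]] <- 1
toy$ts <- cumsum(toy$idx)

# Running count of block starts within each ts group.
toy$iter <- ave(toy$seq, toy$ts, FUN = cumsum)

print(toy)
```

On this toy data `ts` comes out as 1 1 1 1 1 1 2 2 2 and `iter` as 1 1 1 2 2 2 1 1 1: the counter advances at each block start and restarts whenever `ts` changes. ave() splits its first argument by the grouping vector, applies FUN to each piece, and puts the results back in the original row order, so no sorting or merging is needed afterwards.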