Hello,

I have a large data frame (1006222 rows), which I subject to a crude clustering attempt that results in a vector stating whether the data point represented by each row belongs to a cluster or not. Conceptually it looks something like this:

Value   Cluster?
0.01    FALSE
0.03    TRUE
0.04    TRUE
0.05    TRUE
0.07    FALSE
...

What I'm looking for is an efficient strategy to extract each run of consecutive rows marked "TRUE" as a single cluster (represented as a data.frame?) without cluttering memory with thousands of data.frames. I was thinking of an independent data.frame with a list column that references, for each cluster, the row indexes of the big data frame that belong to it ...

Can anyone kindly nudge me and let me know how to deal with this efficiently?

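To make the index-table idea concrete, here is a minimal sketch of one possible approach. It assumes base R's rle() is acceptable and that the cluster column contains no NAs. The toy data and the names df, runs and clusters are made up for illustration, and storing a start/end pair per cluster stands in for the list column mentioned above.

```r
# Toy data mirroring the example above (values and flags are illustrative).
df <- data.frame(
  Value   = c(0.01, 0.03, 0.04, 0.05, 0.07, 0.09, 0.10),
  Cluster = c(FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE)
)

# Run-length encode the logical column: one entry per run of equal values.
runs <- rle(df$Cluster)

# Last and first row index of every run.
run_end   <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1L

# Keep only the TRUE runs: one row per cluster, storing just its row range.
clusters <- data.frame(
  start  = run_start[runs$values],
  end    = run_end[runs$values],
  length = runs$lengths[runs$values]
)

# Rows of a given cluster are pulled out of the big data frame on demand.
second_cluster <- df[clusters$start[2]:clusters$end[2], ]
```

Because consecutive rows form a contiguous range, two integers per cluster are enough to recover its rows, so nothing from the big data frame is copied until a cluster is actually needed.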
Joh

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.