[R] Within ID variable delete all rows after reaching a specific value

Jennifer Sabatier Fri, 25 Apr 2014 19:45:13 -0700

So, I know that's a confusing Subject header.

Here's similar data:



tmp <- data.frame(matrix(
  c(rbinom(1000, 1, .03),
    array(1:127, c(1000, 1)),
    array(format(seq(ISOdate(1990, 1, 1), by = 'month', length = 56),
                 format = '%d.%m.%Y'),
          c(1000, 1))),
  ncol = 3))
tmp <- tmp[with(tmp, order(X2, X3)), ]
table(tmp$X1)


X1 is the variable of interest: disease status.  It's a survival-type
variable, where you are 0 until you become 1.
X2 is the person ID variable.
X3 is the clinic date (here it's monthly, just as an example; in my real
data it's more complicated: the visits are not equally spaced, and the
number of clinic visits differs per ID).

Some people stay X1 = 0 for all clinic visits.  Only a small proportion
become X1=1.

However, the data have errors I need to clean up.  Once someone becomes
X1=1 they should have no more rows in the dataset.  Any rows after that
point are data entry errors.

In my data I have people who continue to have rows in the data.  Sometimes
the rows show X1=0 and sometimes X1=1.  Sometimes there's just one more row
and sometimes there are many more rows.

How can I go through, find the first X1 = 1, and then delete any rows after
that, for each value of X2?
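
One possible way to express this rule in base R, offered as a sketch rather
than a definitive answer: within each ID, count how many X1 = 1 rows have
already occurred before the current row, and keep only the rows where that
count is zero.  The function name drop_after_first_event and the small toy
data frame are hypothetical and used only for illustration.  The sketch
assumes the rows are already sorted chronologically within each ID, that X1
has no missing values, and that the clinic date is stored as a Date (dates
formatted as '%d.%m.%Y' strings do not sort chronologically).

```r
# Hypothetical helper (not from the original post): keep each ID's rows up to
# and including the first row where status == 1, and drop everything after it.
# Assumes `df` is already sorted by visit date within each ID and that the
# status column has no missing values.
drop_after_first_event <- function(df, id = "X2", status = "X1") {
  # as.character() first, so factor or character columns are handled too
  event <- as.integer(as.character(df[[status]])) == 1L
  # Number of events seen strictly before each row, computed within each ID
  prior_events <- ave(as.integer(event), df[[id]],
                      FUN = function(e) cumsum(e) - e)
  df[prior_events == 0L, , drop = FALSE]
}

# Toy data: ID 1 has stray rows after its first event, ID 2 never has an
# event, and ID 3 has a duplicated event row.
toy <- data.frame(
  X1 = c(0, 0, 1, 0, 1, 0, 0, 1, 1),
  X2 = c(1, 1, 1, 1, 1, 2, 2, 3, 3),
  X3 = as.Date(c("1990-01-01", "1990-02-01", "1990-03-01",
                 "1990-04-01", "1990-05-01",
                 "1990-01-01", "1990-02-01",
                 "1990-01-01", "1990-02-01"))
)
toy <- toy[order(toy$X2, toy$X3), ]

cleaned <- drop_after_first_event(toy)
table(cleaned$X2, cleaned$X1)
```

Subtracting e from cumsum(e) is what keeps the first X1 = 1 row itself: the
running count excludes the current row, so only rows that follow an earlier
event are dropped.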

Thanks!

Jen


______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
