On Thu, Jul 22, 2010 at 5:18 AM, Christian Schoder <schoc...@newschool.edu> wrote:
> Dear R-user,
>
> a few weeks ago I consulted the list-serve with a similar question.
> However, my task changed a little but sufficiently to get lost again. So
> I would appreciate any help on the following issue.
>
> I use the plm package and work with firm-level data in a panel. I would
> like to eliminate all firms that do not fulfill the requirement of
> having an observation in every variable used for at least x consecutive
> years.
>
> For illustration of the problem assume the following data set
> > data
>    id year  y  z
> 1   a 2000  1  1
> 2   b 2000 NA  2
> 3   b 2001  3  3
> 4   c 1999  1  1
> 5   c 2000  2  2
> 6   c 2001  4 NA
> 7   c 2002  5  4
> 8   d 1998  6  5
> 9   d 1999  5 NA
> 10  d 2000  6  6
> 11  d 2001  7  7
> 12  d 2002  3  6
> where id is the index of the firm, year the index for the year, and y
> and z are variables. Now, I would like to get rid of all firms with,
> let's say, less than 3 consecutive years in which there are observations
> for every variable. Hence, the procedure should yield
> > data.reduced
>   id year y  z
> 1  d 1998 6  5
> 2  d 1999 5 NA
> 3  d 2000 6  6
> 4  d 2001 7  7
> 5  d 2002 3  6
>
Try this, where DF is your data frame:

  do.call(rbind, by(DF, DF$id, function(x)
    if (length(na.contiguous(x$y * x$z)) >= 3) x
  ))

x$y * x$z is NA whenever either variable is NA, and na.contiguous
returns the longest stretch without NAs, so a firm is kept only if that
stretch has at least 3 rows. This assumes the rows of each firm are
sorted by year and that no years are skipped.
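
If the panel may have skipped years, or a firm with no complete row at all (as far as I know, na.contiguous stops with an error when every value is NA), a variant along the following lines might be safer. It is only a sketch, not part of the suggestion above: the helper longest_complete_run and the names min_years, keep and DF_reduced are made up for illustration, and DF is rebuilt from the example in the question.

```r
# Example data from the question, named DF to match the reply.
DF <- data.frame(
  id   = c("a", "b", "b", "c", "c", "c", "c", "d", "d", "d", "d", "d"),
  year = c(2000, 2000, 2001, 1999, 2000, 2001, 2002,
           1998, 1999, 2000, 2001, 2002),
  y    = c(1, NA, 3, 1, 2, 4, 5, 6, 5, 6, 7, 3),
  z    = c(1, 2, 3, 1, 2, NA, 4, 5, NA, 6, 7, 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: length of the longest run of consecutive years
# in which every column named in `vars` is observed for one firm.
longest_complete_run <- function(x, vars) {
  x <- x[order(x$year), ]
  ok <- complete.cases(x[, vars, drop = FALSE])
  if (!any(ok)) return(0L)
  # Start a new block at every incomplete row and after every gap in years,
  # so the complete rows inside one block form one consecutive run.
  block <- cumsum(!ok | c(TRUE, diff(x$year) != 1))
  max(tapply(ok, block, sum))
}

min_years <- 3  # assumed threshold, taken from the example in the question
keep <- sapply(split(DF, DF$id), longest_complete_run, vars = c("y", "z"))
DF_reduced <- DF[DF$id %in% names(keep)[keep >= min_years], ]
print(DF_reduced)
```

Unlike the do.call/by version, this keeps the original row names and takes the variables as a character vector, so more columns can be added to vars without rewriting the product.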