Try this:

Dat <- read.table(textConnection(
"id year  y  z
1   a 2000  1  1
2   b 2000 NA  2
3   b 2001  3  3
4   c 1999  1  1
5   c 2000  2  2
6   c 2001  4 NA
7   c 2002  5  4
8   d 1998  6  5
9   d 1999  5 NA
10  d 2000  6  6
11  d 2001  7  7
12  d 2002  3  6"
), header = TRUE)
closeAllConnections()

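n.years <- 3 # minimum number of consecutive complete years
ok <- !rowSums(is.na(Dat[-(1:2)])) # rows complete in the variables of interest
# keep a firm if any run of complete rows (sorted by year) reaches n.years
ind <- ave(ok, Dat$id, FUN = function (x) {
    r <- rle(x) # runs of complete / incomplete rows
    any(r$lengths[r$values] >= n.years)
})
Dat[ind, ]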
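
The selection above works on rows in the order they appear, so it relies on each firm's rows being sorted by year with no missing calendar years. If that cannot be guaranteed, a minimal sketch along the following lines might help. Note that has.run() is a hypothetical helper written for this illustration (it is not part of base R or plm), and Dat2 is made-up example data in the same layout as Dat.

```r
# Made-up example: firm a has a gap in its years (2002 is missing),
# firm b has an NA in the middle, firm c has 3 complete consecutive years
Dat2 <- read.table(text = "id year y z
a 2000 1 1
a 2001 2 2
a 2003 3 3
b 1999 1 1
b 2000 2 NA
b 2001 3 3
c 2000 1 1
c 2001 2 2
c 2002 3 3", header = TRUE, stringsAsFactors = FALSE)

# has.run() is a hypothetical helper: TRUE if there are at least n
# complete rows in consecutive calendar years
has.run <- function(year, ok, n) {
    o <- order(year)
    year <- year[o]
    ok <- ok[o]
    # start a new block at every gap in the years and at every incomplete row
    block <- cumsum(c(TRUE, diff(year) != 1) | !ok)
    # complete rows within a block are consecutive by construction
    any(tapply(ok, block, sum) >= n)
}

n.years <- 3
complete <- !rowSums(is.na(Dat2[-(1:2)]))

# one TRUE/FALSE per firm
keep <- sapply(split(seq_len(nrow(Dat2)), Dat2$id), function(i)
    has.run(Dat2$year[i], complete[i], n = n.years))

Dat2.reduced <- Dat2[Dat2$id %in% names(keep)[keep], ]
Dat2.reduced
```

Here only firm c should be kept: firm a has three complete rows, but the missing year 2002 breaks the run.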


I hope it helps.

Best,
Dimitris


On 7/22/2010 11:18 AM, Christian Schoder wrote:
Dear R-users,

A few weeks ago I consulted the list with a similar question. My task
has since changed a little, but enough for me to get lost again, so
I would appreciate any help on the following issue.

I use the plm package and work with firm-level data in a panel. I would
like to eliminate all firms that do not fulfill the requirement of
having an observation in every variable used for at least x consecutive
years.

To illustrate the problem, assume the following data set:
data
    id year  y  z
1   a 2000  1  1
2   b 2000 NA  2
3   b 2001  3  3
4   c 1999  1  1
5   c 2000  2  2
6   c 2001  4 NA
7   c 2002  5  4
8   d 1998  6  5
9   d 1999  5 NA
10  d 2000  6  6
11  d 2001  7  7
12  d 2002  3  6
where id is the index of the firm, year is the index for the year, and y
and z are variables. Now I would like to get rid of all firms with,
let's say, fewer than 3 consecutive years in which there are observations
for every variable. Hence, the procedure should yield
data.reduced
    id year  y  z
1   d 1998  6  5
2   d 1999  5 NA
3   d 2000  6  6
4   d 2001  7  7
5   d 2002  3  6

Thank you very much for any help!

Cheers, Christian

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
