Let's tackle the bigger problem of doing this not just for time = 3 but for all times.
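For comparison, the narrower request in the original question (x at time 3, falling back to time 4, else NA) might be sketched as below. This is only an illustration under stated assumptions: the helper name `pick_x` and its argument `t` are made up here and are not part of the original exchange, and the data frame is retyped from the question.

```r
# Data from the original question, in long format.
dat <- data.frame(
  id   = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4),
  time = c(1, 2, 3, 4, 2, 3, 4, 1, 3, 4, 5, 2, 3, 6),
  x    = c(10, 11, 23, 23, 12, 13, 14, 11, 15, 18, 21, 22, 27, 29)
)

# pick_x() is a hypothetical helper (an assumption, not from the post):
# for the rows of one id, return x at time t, else x at time t + 1, else NA.
pick_x <- function(d, t = 3) {
  hit <- match(c(t, t + 1), d$time)  # row positions of t and t + 1 (NA if absent)
  hit <- hit[!is.na(hit)]            # keep only the times that were observed
  if (length(hit) > 0) d$x[hit[1]] else NA_real_
}

# Apply the helper to each id's block of rows; the result is named by id.
x3 <- sapply(split(dat, dat$id), pick_x)
x3
```

With this data every id happens to have an observation at time 3, so the fallback is not exercised; calling `sapply(split(dat, dat$id), pick_x, t = 5)` shows both the fallback (id 4 has no time 5 but has time 6) and the NA case (ids 1 and 2 have neither).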
First we start with your data frame:

> dat
   id time  x
1   1    1 10
2   1    2 11
3   1    3 23
4   1    4 23
5   2    2 12
6   2    3 13
7   2    4 14
8   3    1 11
9   3    3 15
10  3    4 18
11  3    5 21
12  4    2 22
13  4    3 27
14  4    6 29
>
> ### Now put the data into an id x time matrix, with gaps:
> mat <- with(dat, {
+   lev_id <- sort(unique(id))
+   lev_tm <- sort(unique(time))
+   M <- matrix(NA, length(lev_id), length(lev_tm))
+   dimnames(M) <- list(id = lev_id, time = lev_tm)
+   M[cbind(match(id, lev_id), match(time, lev_tm))] <- x
+   M
+ })
> mat
   time
id   1  2  3  4  5  6
  1 10 11 23 23 NA NA
  2 NA 12 13 14 NA NA
  3 11 NA 15 18 21 NA
  4 NA 22 27 NA NA 29
>
> ### Now do the replacements: a gap at one time is filled with the
> ### same id's value at the next time, so we work along the columns
> ### (this is a very questionable dodge, by the way!)
> for(j in 2:ncol(mat))
+   if(any(k <- is.na(mat[, j-1])))
+     mat[k, j-1] <- mat[k, j]
>
> mat
   time
id   1  2  3  4  5  6
  1 10 11 23 23 NA NA
  2 12 12 13 14 NA NA
  3 11 15 15 18 21 NA
  4 22 22 27 NA 29 29
>
> ### some gaps cannot be filled.
> ### now turn it back into a data frame;
> ### this is a splendid trick that no one knows about:
> dat <- as.data.frame(as.table(mat), responseName = "x")
> dat <- with(dat, dat[order(id, time), ])
>
> ### this will look OK, but the first two columns are factors.
> ### as we started with numeric variables (if we did) then
> ### it might be useful to turn them back to numeric variables
> ### again:
> dat <- within(dat, {
+   id <- as.numeric(as.character(id))
+   time <- as.numeric(as.character(time))
+ })
>
> dat
   id time  x
1   1    1 10
5   1    2 11
9   1    3 23
13  1    4 23
17  1    5 NA
21  1    6 NA
2   2    1 12
6   2    2 12
10  2    3 13
14  2    4 14
18  2    5 NA
22  2    6 NA
3   3    1 11
7   3    2 15
11  3    3 15
15  3    4 18
19  3    5 21
23  3    6 NA
4   4    1 22
8   4    2 22
12  4    3 27
16  4    4 NA
20  4    5 29
24  4    6 29
>
> ### As many gaps have been filled as can be filled (with fake data!).
> ### If you want to remove those still missing, you can use
> dat <- na.omit(dat)

Bill Venables
http://www.cmis.csiro.au/bill.venables/

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of gallon li
Sent: Sunday, 29 March 2009 6:51 PM
To: r-help
Subject: [R] select observations from longitudinal data

Suppose I have longitudinal data in long format:

id time x
1 1 10
1 2 11
1 3 23
1 4 23
2 2 12
2 3 13
2 4 14
3 1 11
3 3 15
3 4 18
3 5 21
4 2 22
4 3 27
4 6 29

I want to select the x value for each ID at time 3. When that observation is missing, I want to replace it with the observation at time 4. If neither is available, just use NA. How can I implement this with a quick command?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.