Hello r-devel,

I have a data.frame with 3 columns. I would like to group by one column (id), find the max of the date column (date) within each group, and return the row for that max date: the id, the date, and the value in the remaining column (decod).
Example:

> dat <- data.frame(id = rep(1:3, 3),
+                   date = as.Date(rep(c("2005-08-25", "2005-08-26", "2005-08-29"), each = 3)),
+                   decod = c("SCREEN", "SCREEN", "SCREEN", "RAND", "RAND", "RAND",
+                             "COMPLETE", "COMPLETE", "WITHDRAWAL"))

What I need it to return is:

  id x.decod.1.        end
1  1   COMPLETE 2005-08-29
2  2   COMPLETE 2005-08-29
3  3 WITHDRAWAL 2005-08-29

I can get the max date and the id in two different ways:

> do.call("rbind", lapply(split(dat, dat$id),
+         function(x) data.frame(id = x$id[1], max_date = max(x$date))))
  id   max_date
1  1 2005-08-29
2  2 2005-08-29
3  3 2005-08-29

OR

> aggregate(dat$date, list(USUBJID = dat$id), FUN = "max")
  USUBJID     x
1       1 13024
2       2 13024
3       3 13024

(which oddly returns the max as a number of days since 1970-01-01 instead of as a date value).

Neither result includes the decod value. I'd like to get it without looping or filtering on date and usubjid, if possible. Is there a way to return the index of the max date, which I could then use to index the data.frame?

I came across a function, dapply, which looks like it might work, but unfortunately its package isn't one I can install in the near future due to some company restrictions.

Any ideas would be appreciated,
VL
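For reference, a minimal base R sketch of the "index of the max" idea, under the assumption that a tie on the max date within an id can be resolved by taking the first matching row. which.max() returns the position of the first maximum, so it can index the rows of each group directly. The names latest, ord and latest2 are illustrative only, and the second variant swaps in a sort-and-deduplicate technique in place of split().

```r
# Sample data copied from the example in the message.
dat <- data.frame(
  id    = rep(1:3, 3),
  date  = as.Date(rep(c("2005-08-25", "2005-08-26", "2005-08-29"), each = 3)),
  decod = c("SCREEN", "SCREEN", "SCREEN", "RAND", "RAND", "RAND",
            "COMPLETE", "COMPLETE", "WITHDRAWAL")
)

# Variant 1: per id, which.max() gives the position of the latest date
# within the group; use that position to pick the whole row.
# Assumption: on ties, the first row with the max date is wanted.
latest <- do.call(rbind, lapply(split(dat, dat$id), function(x) {
  x[which.max(as.numeric(x$date)), ]
}))

# Variant 2: sort by id, then by date descending, and keep the first
# row seen for each id (no split() needed).
ord     <- dat[order(dat$id, -as.numeric(dat$date)), ]
latest2 <- ord[!duplicated(ord$id), ]

print(latest)
print(latest2)
```

Both variants keep the date column as class Date, because the rows are selected by index instead of being rebuilt from an aggregated value.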
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel