Hi,

I have a dataset from which I would like to select rows based on matching conditions: for each group, return the row with the maximum value of a variable, or return just one row if the counts in that group are duplicated. My dataset looks like this:

PGID  PTID   Year  Visit  Count
6755  53121  2009  1      0
6755  53121  2009  2      0
6755  53121  2009  3      0
6755  53122  2008  1      0
6755  53122  2008  2      0
6755  53122  2008  3      1
6755  53122  2009  1      0
6755  53122  2009  2      1
6755  53122  2009  3      2
I would like to group the rows where PTID and Year match and return the row with the maximum Count, or a single row if the counts are all the same, so that I get this output:

PGID  PTID   Year  Visit  Count
6755  53121  2009  1      0
6755  53122  2008  3      1
6755  53122  2009  3      2

I tried the following code. The output is almost correct, but duplicate rows were included:

    df2<-with(df, sapply(split(df, list(PTID, Year)), function(x) if (nrow(x)) x[which(x$Count==max(x$Count)),]))
    df<-do.call(rbind,df)
    rownames(df)<-1:nrow(df)

Any suggestions? Thanks much for your responses!
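For reference, the selection rule described above could be sketched roughly as follows. This is only one possible approach, not a definitive answer. It assumes that a tie should be resolved by keeping the first row of the group (the lowest Visit), which is what the desired output shows. The names `groups`, `picked`, and `result` are illustrative and do not appear in the original code. The sketch relies on `which.max()`, which returns the index of the first maximum only, whereas `which(x$Count == max(x$Count))` returns every tied index.

```r
# Rebuild the example data from the post.
df <- data.frame(
  PGID  = rep(6755, 9),
  PTID  = c(rep(53121, 3), rep(53122, 6)),
  Year  = c(rep(2009, 3), rep(2008, 3), rep(2009, 3)),
  Visit = rep(1:3, 3),
  Count = c(0, 0, 0, 0, 0, 1, 0, 1, 2)
)

# One data frame per PTID/Year combination; drop = TRUE discards
# combinations that never occur (e.g. PTID 53121 in 2008).
groups <- split(df, list(df$PTID, df$Year), drop = TRUE)

# which.max() returns the index of the FIRST maximum, so a group
# with tied counts contributes exactly one row (assumed tie rule).
picked <- lapply(groups, function(x) x[which.max(x$Count), ])

# Stack the selected rows, sort them, and reset the row names.
result <- do.call(rbind, picked)
result <- result[order(result$PTID, result$Year), ]
rownames(result) <- NULL
result
```

If a different tie rule is wanted (for example, the last visit), the rows could be reordered within each group before `which.max()` is applied.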