Hi,
I have a dataset from which I would like to select rows by group: among the
rows that match on certain variables, return the row with the maximum value
of another variable, or just one row if that maximum is tied.  My dataset
looks like this:
PGID    PTID    Year     Visit  Count
6755    53121   2009    1       0
6755    53121   2009    2       0
6755    53121   2009    3       0
6755    53122   2008    1       0
6755    53122   2008    2       0
6755    53122   2008    3       1
6755    53122   2009    1       0
6755    53122   2009    2       1
6755    53122   2009    3       2

For each combination of PTID and Year, I would like to keep the row with the
maximum Count. If several rows in a group share the same Count, I want only
one of them, such that I get this output:
PGID    PTID    Year     Visit  Count
6755    53121   2009    1       0
6755    53122   2008    3       1
6755    53122   2009    3       2

I tried the following code. The output is almost correct, but duplicate rows
are included whenever the maximum Count is tied within a group:
df2 <- with(df, sapply(split(df, list(PTID, Year)),
    function(x) if (nrow(x)) x[which(x$Count == max(x$Count)), ]))
df <- do.call(rbind, df2)
rownames(df) <- 1:nrow(df)
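
For reference, here is a minimal sketch of one possible variation, not a definitive solution. It assumes that when counts are tied, keeping the first row of the group is acceptable (this matches the desired output above). It relies on which.max(), which returns the index of the first maximum only, and on drop = TRUE in split() to discard empty PTID/Year combinations. The names groups, picked and res are just illustrative.

```r
# Example data copied from the table above
df <- data.frame(
  PGID  = rep(6755, 9),
  PTID  = c(rep(53121, 3), rep(53122, 6)),
  Year  = c(rep(2009, 3), rep(2008, 3), rep(2009, 3)),
  Visit = rep(1:3, 3),
  Count = c(0, 0, 0, 0, 0, 1, 0, 1, 2)
)

# Split by PTID and Year; drop = TRUE discards combinations with no rows
groups <- split(df, list(df$PTID, df$Year), drop = TRUE)

# which.max() gives the position of the FIRST maximum, so a tie
# still yields exactly one row per group
picked <- lapply(groups, function(x) x[which.max(x$Count), ])

# Stack the per-group rows, sort them, and reset the row names
res <- do.call(rbind, picked)
res <- res[order(res$PTID, res$Year), ]
rownames(res) <- NULL
res
```

The only real change from my attempt is which.max(x$Count) in place of which(x$Count == max(x$Count)), since the latter returns every tied row.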

Any suggestions? 
Thanks much for your responses!
