Hi, I have a beginner question. I'm trying to extract rows from my data frame according to some nested rules.
I have something like the data frame test.df:

    test.df = data.frame(V1=c(rep("A",10), rep("B",10), rep("C",5)),
                         V2=c(rep(1,5), rep(2,5), rep(1,5), rep(2,5), rep(1,5)))

       V1 V2
    1   A  1
    2   A  1
    3   A  1
    4   A  1
    5   A  1
    6   A  2
    7   A  2
    8   A  2
    9   A  2
    10  A  2
    11  B  1
    12  B  1
    13  B  1
    14  B  1
    15  B  1
    16  B  2
    17  B  2
    18  B  2
    19  B  2
    20  B  2
    21  C  1
    22  C  1
    23  C  1
    24  C  1
    25  C  1

For each value of the variable V1 (group A, B or C), I want to extract the rows for which V2 is the maximum within that group, in order to get:

       V1 V2
    1   A  2
    2   A  2
    3   A  2
    4   A  2
    5   A  2
    6   B  2
    7   B  2
    8   B  2
    9   B  2
    10  B  2
    11  C  1
    12  C  1
    13  C  1
    14  C  1
    15  C  1

I wrote this function:

    mytest = function(df) {
      myS = unique(df$V1)
      df.tmp = subset(df, df$V1==myS[[1]])
      df.sub = subset(df.tmp, df.tmp$V2==max(df.tmp$V2))
      for (i in 2:length(myS)) {
        df.tmp = subset(df, df$V1==myS[[i]])
        df.sub = merge(df.sub, subset(df.tmp, df.tmp$V2==max(df.tmp$V2)), all=TRUE)
      }
      df.sub
    }

but I need something more efficient and more general. Any ideas?

Thanks in advance,
Arnaud
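
For reference, the per-group filter described above can probably be written without an explicit loop. The following is only a sketch, assuming V2 is numeric with no NA values; the helper name max_rows_by_group and its group/value arguments are made up for illustration. It uses base R's ave() to compute each row's group maximum and then keeps the rows that match it.

```r
# Example data from the post
test.df <- data.frame(
  V1 = c(rep("A", 10), rep("B", 10), rep("C", 5)),
  V2 = c(rep(1, 5), rep(2, 5), rep(1, 5), rep(2, 5), rep(1, 5))
)

# Hypothetical helper (not from the original post): keep the rows whose
# `value` column equals the maximum of that column within their `group`.
# Assumes `value` is numeric and contains no NA.
max_rows_by_group <- function(df, group = "V1", value = "V2") {
  # ave() returns a vector as long as the input, holding each row's group max
  group_max <- ave(df[[value]], df[[group]], FUN = max)
  out <- df[df[[value]] == group_max, , drop = FALSE]
  rownames(out) <- NULL  # renumber rows 1..n, as in the desired output
  out
}

result <- max_rows_by_group(test.df)
print(result)
```

Because the grouping and value columns are passed by name, the same helper should also work on data frames with different column names or a single group, which is a case where the 2:length(myS) loop in mytest would misbehave.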