Hi,

I have a beginner question. I'm trying to extract rows from my data frame
according to a rule that depends on groups.

I have something like the dataframe test.df:

test.df = data.frame(V1=c(rep("A",10), rep("B",10), rep("C",5)),
                     V2=c(rep(1,5), rep(2,5), rep(1,5), rep(2,5), rep(1,5)))

   V1 V2
1   A  1
2   A  1
3   A  1
4   A  1
5   A  1
6   A  2
7   A  2
8   A  2
9   A  2
10  A  2
11  B  1
12  B  1
13  B  1
14  B  1
15  B  1
16  B  2
17  B  2
18  B  2
19  B  2
20  B  2
21  C  1
22  C  1
23  C  1
24  C  1
25  C  1

For each value of the variable V1 (group A, B or C), I want to extract the
rows in which V2 equals the maximum of V2 within that group, in order to get:

   V1 V2
1   A  2
2   A  2
3   A  2
4   A  2
5   A  2
6   B  2
7   B  2
8   B  2
9   B  2
10  B  2
11  C  1
12  C  1
13  C  1
14  C  1
15  C  1

I wrote this function:

mytest = function(df) {
  myS = unique(df$V1)
  df.tmp = subset(df, df$V1==myS[[1]])
  df.sub = subset(df.tmp, df.tmp$V2==max(df.tmp$V2))
  for (i in 2:length(myS)) {
    df.tmp = subset(df, df$V1==myS[[i]])
    df.sub = merge(df.sub, subset(df.tmp, df.tmp$V2==max(df.tmp$V2)),
                   all=TRUE)
  }
  df.sub
}

but I need something more efficient and more general. Any ideas?
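
For reference, here is a minimal sketch of one possible loop-free approach in
base R. It assumes that ave() is acceptable for computing the per-group
maximum and that V2 contains no missing values; the names grp.max and res are
only illustrative.

```r
# Example data from the post
test.df <- data.frame(V1 = c(rep("A", 10), rep("B", 10), rep("C", 5)),
                      V2 = c(rep(1, 5), rep(2, 5), rep(1, 5), rep(2, 5), rep(1, 5)))

# ave() applies FUN within each group of V1 and returns a vector aligned
# with the rows, so grp.max[i] is the maximum V2 in the group of row i.
grp.max <- ave(test.df$V2, test.df$V1, FUN = max)

# Keep only the rows whose V2 equals the maximum of their group
# (assumption: no NA in V2, otherwise the comparison yields NA rows).
res <- test.df[test.df$V2 == grp.max, ]
rownames(res) <- NULL  # renumber the rows from 1
res
```

Nothing here refers to the group labels themselves, so the same two lines
should work for any number of groups, including a single one.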

Thanks in advance,
Arnaud

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
