On Mar 10, 2010, at 10:30 AM, arnaud chozo wrote:
Hi,
I have a beginner question. I'm trying to extract rows from my data frame
according to some nested rules.
I have something like the following data frame, test.df:
test.df = data.frame(V1=c(rep("A",10), rep("B",10), rep("C",5)),
                     V2=c(rep(1,5), rep(2,5), rep(1,5), rep(2,5), rep(1,5)))
   V1 V2
1   A  1
2   A  1
3   A  1
4   A  1
5   A  1
6   A  2
7   A  2
8   A  2
9   A  2
10  A  2
11  B  1
12  B  1
13  B  1
14  B  1
15  B  1
16  B  2
17  B  2
18  B  2
19  B  2
20  B  2
21  C  1
22  C  1
23  C  1
24  C  1
25  C  1
For each value of the variable V1 (group A, B or C), I want to extract the
rows for which V2 is the maximum within that group, in order to get:
   V1 V2
1   A  2
2   A  2
3   A  2
4   A  2
5   A  2
6   B  2
7   B  2
8   B  2
9   B  2
10  B  2
11  C  1
12  C  1
13  C  1
14  C  1
15  C  1
> test.df[test.df$V2 == ave(test.df$V2, test.df$V1, FUN=max), ]
   V1 V2
6   A  2
7   A  2
8   A  2
9   A  2
10  A  2
16  B  2
17  B  2
18  B  2
19  B  2
20  B  2
21  C  1
22  C  1
23  C  1
24  C  1
25  C  1
You get a bit of extra information in the form of the row numbers
which were extracted. If you want to get rid of that information, it
would not be difficult.
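
For what it's worth, here is a minimal sketch of dropping those row numbers. It assumes the subset is first stored in a variable; `res` and `grp.max` are names made up for this example and do not appear in the original posts. The intermediate `grp.max` is there only to show what `ave()` returns: a vector as long as V2 holding each row's group maximum.

```r
# Rebuild the example data from the original post.
test.df <- data.frame(V1 = c(rep("A", 10), rep("B", 10), rep("C", 5)),
                      V2 = c(rep(1, 5), rep(2, 5), rep(1, 5), rep(2, 5), rep(1, 5)))

# ave() applies FUN within each level of V1 and returns one value per row,
# so grp.max has the same length as test.df$V2.
grp.max <- ave(test.df$V2, test.df$V1, FUN = max)

# Keep the rows where V2 equals its group maximum
# ('res' is a hypothetical name, not from the thread).
res <- test.df[test.df$V2 == grp.max, ]

# Discard the original row numbers; R then renumbers the rows from 1.
rownames(res) <- NULL
res
```

Resetting the row names this way leaves the columns untouched, so the result should match the layout asked for in the question.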
--
David.
I wrote this function:
mytest = function(df) {
  myS = unique(df$V1)
  df.tmp = subset(df, df$V1==myS[[1]])
  df.sub = subset(df.tmp, df.tmp$V2==max(df.tmp$V2))
  for (i in 2:length(myS)) {
    df.tmp = subset(df, df$V1==myS[[i]])
    df.sub = merge(df.sub, subset(df.tmp, df.tmp$V2==max(df.tmp$V2)),
                   all=TRUE)
  }
  df.sub
}
but I need something more efficient and more general. Any ideas?
Thanks in advance,
Arnaud
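
As a rough illustration of the "more general" version asked for here, the `ave()` one-liner from the reply can be wrapped in a small function that takes the column names as arguments. This is only a sketch: `rows.at.group.max` and its arguments `group` and `value` are hypothetical names, and it assumes the value column has no missing values (with NAs, `max()` and the `==` comparison would need extra handling).

```r
# Hypothetical helper (not from the thread): keep the rows of 'df' where
# column 'value' equals the maximum of 'value' within each level of 'group'.
# Assumes df[[value]] contains no NA values.
rows.at.group.max <- function(df, group, value) {
  # One group maximum per row, aligned with the rows of df.
  grp.max <- ave(df[[value]], df[[group]], FUN = max)
  out <- df[df[[value]] == grp.max, , drop = FALSE]
  # Renumber the rows from 1 instead of keeping the original row numbers.
  rownames(out) <- NULL
  out
}

# Usage with the example data from the question.
test.df <- data.frame(V1 = c(rep("A", 10), rep("B", 10), rep("C", 5)),
                      V2 = c(rep(1, 5), rep(2, 5), rep(1, 5), rep(2, 5), rep(1, 5)))
rows.at.group.max(test.df, "V1", "V2")
```

Because there is no explicit loop over the groups, the same call also works when the data contains a single group.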
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT