Sam - I think that subset is what's throwing you off here -- you need a function that will simply return the 10 rows of each group with the highest values of x:

   function(dat)dat[order(dat$x,decreasing=TRUE)[1:10],]

Then

   ddply(df,'z',function(dat)dat[order(dat$x,decreasing=TRUE)[1:10],])

should give you what you want. In this simple case, you could also use

   do.call(rbind,by(df,df$z,function(dat)dat[order(dat$x,decreasing=TRUE)[1:10],]))

from base R to get the same result.

Hope this helps.
                                        - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spec...@stat.berkeley.edu

On Fri, 27 Jan 2012, Sam Albers wrote:

Hello,

I am looking for a way to subset a data frame by choosing the top ten
maximum values from that data frame. In addition, this needs to happen
within factor levels.

## I've used plyr here but I'm not married to this approach
require(plyr)

## I've created a data.frame with two groups and then an id variable (y)
df <- data.frame(x=rnorm(400, mean=20), y=1:400, z=c("A","B"))

## So using ddply I can find the highest value of x
df.max1 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[1])

## Or the 2nd highest value
df.max2 <- ddply(df, c("z"), subset, x==sort(x, TRUE)[2])

## And so on.... but when I try to make a series of numbers like so
## to get the top ten values, I don't get a warning message but
## two values that don't really make sense to me
df.max <- ddply(df, c("z"), subset, x==sort(x, TRUE)[1:10])

## So no error message when I use the method above, which is clearly wrong.
## But I really am not sure how to diagnose the problem.

## Can anyone suggest a way to subset a data.frame with groups to select
## the top ten max values in that data.frame for each group?

## Thanks so much in advance.

Sam

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
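
A possible way to diagnose the silent failure in the question, offered as a sketch rather than a definitive account: `x==sort(x, TRUE)[1:10]` compares the two vectors position by position, recycling the ten sorted values along `x`, so a row is kept only if it happens to line up with its own value. No warning appears because each group has 200 rows, a multiple of 10. The example below reuses the data frame from the question; the `set.seed()` call and the helper name `top_rows` are assumptions added for illustration and are not part of the thread. Using `head()` instead of `[1:10]` is a swapped-in variation that avoids rows of `NA` if a group ever has fewer than ten rows.

```r
set.seed(1)  # assumption: fixed seed so the run is reproducible
df <- data.frame(x = rnorm(400, mean = 20), y = 1:400, z = c("A", "B"))

# Look at a single group to see what the comparison inside subset() does.
a <- subset(df, z == "A")
top10 <- sort(a$x, decreasing = TRUE)[1:10]

# '==' recycles top10 along a$x: row i is compared only with
# top10[((i - 1) %% 10) + 1], not with all ten values.
recycled <- a$x == top10

# '%in%' tests membership in the set of top ten values instead.
membership <- a$x %in% top10

sum(recycled)    # number of rows that happen to line up with their own value
sum(membership)  # number of rows whose value is among the top ten

# Hypothetical helper: the n rows with the largest x, without NA padding
# when a group has fewer than n rows.
top_rows <- function(dat, n = 10) {
  head(dat[order(dat$x, decreasing = TRUE), ], n)
}

res <- do.call(rbind, by(df, df$z, top_rows))
table(res$z)
```

The same helper can be passed to plyr as `ddply(df, "z", top_rows)`. Note that `%in%` keeps every row tied with a top-ten value, whereas ordering and taking the first ten rows always returns at most ten rows per group.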