All the solutions in this thread so far use the lapply(split(...)) paradigm,
either directly or indirectly. That paradigm doesn't scale well: split()
copies the data into a separate piece per group before any work is done.
That's the likely source of quite a few 'out of memory' errors and
performance issues in R. data.table doesn't do that internally, and its
syntax is pretty easy.

> tmp <- data.table(index = gl(2,20), foo = rnorm(40))

> tmp[, .SD[head(order(-foo),5)], by=index]
      index index.1       foo
 [1,]     1       1 1.9677303
 [2,]     1       1 1.2731872
 [3,]     1       1 1.1100931
 [4,]     1       1 0.8194719
 [5,]     1       1 0.6674880
 [6,]     2       2 1.2236383
 [7,]     2       2 0.9606766
 [8,]     2       2 0.8654497
 [9,]     2       2 0.5404112
[10,]     2       2 0.3373457
> 

As you can see, it currently repeats the group column (as index.1), which
is a shame; fixing that is on the to-do list.
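
For comparison, here is a rough sketch of the two approaches side by side
on the same data. The names df, dt, top5_base and top5_dt are only
illustrative, and set.seed() is added so the run is repeatable. It assumes
the data.table package is installed; the exact columns printed for the
data.table result may differ between package versions.

```r
library(data.table)

set.seed(1)  # illustrative seed, not part of the original example
df <- data.frame(index = gl(2, 20), foo = rnorm(40))

# lapply(split(...)) paradigm: split() materialises one data.frame per
# group, each piece is sorted and trimmed, then rbind() glues them back.
top5_base <- do.call(
  rbind,
  lapply(split(df, df$index), function(g) head(g[order(-g$foo), ], 5))
)

# data.table: the same top-5-per-group query as a single grouped expression.
dt <- as.data.table(df)
top5_dt <- dt[, .SD[head(order(-foo), 5)], by = index]
```

Both results hold the five largest foo values within each level of index.
The difference is in how the grouping is carried out, not in the answer.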

Matthew

http://datatable.r-forge.r-project.org/

