hi Steve,

> * Steve Lianoglou <znvyvatyvfg.ubarl...@tznvy.pbz> [2012-11-26 16:08:59 -0500]:
> On Mon, Nov 26, 2012 at 3:13 PM, Sam Steingold <s...@gnu.org> wrote:
>>> * Steve Lianoglou <znvyvatyvfg.ubarl...@tznvy.pbz> [2012-11-19 13:30:03 -0800]:
>>>
>>> For instance, if you want the min and max of `delay` within each group
>>> defined by `share.id`, and let's assume `infl` is a data.frame, you
>>> can do something like so:
>>>
>>> R> as.data.table(infl)
>>> R> setkey(infl, share.id)
>>> R> result <- infl[, list(min=min(delay), max=max(delay)), by="share.id"]
>>
>> perfect, thanks.
>> alas, the resulting table does not contain the share.id column.
>> do I need to add something like "id=unique(share.id)" to the list?
>> also, if there is a field in the original table infl which only depends
>> on share.id, how do I add this unique value to the summary?
>> it appears that "count=unique(country)" in list() does what I need, but
>> it slows down the process.
>
> Hmm ... I think it should be there, but I'm having a hard time
> remember what you want.
>
> Could you please copy paste the output of `(head(infl, 20))` as
> well as an approximation of what the result is that you want.

this prints all the levels for all the factor columns and takes megabytes.

--8<---------------cut here---------------start------------->8---
> f <- data.frame(id=rep(1:3,4),country=rep(6:8,4),delay=1:12)
> f
   id country delay
1   1       6     1
2   2       7     2
3   3       8     3
4   1       6     4
5   2       7     5
6   3       8     6
7   1       6     7
8   2       7     8
9   3       8     9
10  1       6    10
11  2       7    11
12  3       8    12
> f <- as.data.table(f)
> setkey(f,id)
> delays <- f[,list(min=min(delay),max=max(delay),count=.N,country=unique(country)),by="id"]
> delays
   id min max count country
1:  1   1  10     4       6
2:  2   2  11     4       7
3:  3   3  12     4       8
--8<---------------cut here---------------end--------------->8---

this is still too slow, apparently because of unique.
how do I speed it up?

Thanks.

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://iris.org.il http://ffii.org
http://pmw.org.il http://mideasttruth.com
Programming is like sex: one mistake and you have to support it for a lifetime.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.