> * Steve Lianoglou [2012-11-27 12:53:23
> -0500]:
> On Tue, Nov 27, 2012 at 11:29 AM, Sam Steingold wrote:
>>> * Steve Lianoglou [2012-11-26 19:47:25
>>> -0500]:
> [snip]
>>> It just occurred to me that this is even better:
>>>
>>> R> setkeyv(f, c("share.id", "delay"))
>>> R> result <- f[, l
Hi,
On Tue, Nov 27, 2012 at 11:29 AM, Sam Steingold wrote:
>> * Steve Lianoglou [2012-11-26 19:47:25
>> -0500]:
[snip]
>> It just occurred to me that this is even better:
>>
>> R> setkeyv(f, c("share.id", "delay"))
>> R> result <- f[, list(min=delay[1L], max=delay[.N], count=.N,
>> country=cou
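
The preview above is cut off, so here is a minimal sketch of the idea it describes, not the poster's exact code. It assumes grouping by `share.id` (as elsewhere in the thread) and uses made-up toy values. Once the table is keyed on both `share.id` and `delay`, the rows within each group are sorted by `delay`, so the first and last rows give the minimum and maximum without calling `min()` or `max()`.

```r
library(data.table)

# Toy data shaped like the thread's example; the values are invented.
f <- data.table(share.id = rep(1:3, 4),
                country  = rep(6:8, 4),
                delay    = c(5, 9, 2, 7, 1, 8, 3, 6, 4, 12, 11, 10))

# Sort by share.id, then by delay within each share.id.
setkeyv(f, c("share.id", "delay"))

# Rows in each group are now ordered by delay, so the first row holds
# the minimum and the last row (index .N) holds the maximum.
# country[1L] is safe only because each id has exactly one country.
result <- f[, list(min = delay[1L], max = delay[.N], count = .N,
                   country = country[1L]),
            by = "share.id"]
print(result)
```

This shortcut is only valid while the key includes `delay`; if the table is re-keyed or reordered, `min(delay)` and `max(delay)` are the safer choice.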
> * Steve Lianoglou [2012-11-26 19:47:25
> -0500]:
>
> On Monday, November 26, 2012, Sam Steingold wrote:
> [snip]
>
>>
>> there is precisely one country for each id.
>> i.e., unique(country) is the same as country[1].
>> thanks a lot for the suggestion!
>>
>> > R> result <- f[, list(min=min(dela
On Monday, November 26, 2012, Sam Steingold wrote:
[snip]
>
> there is precisely one country for each id.
> i.e., unique(country) is the same as country[1].
> thanks a lot for the suggestion!
>
> > R> result <- f[, list(min=min(delay), max=max(delay),
> > count=.N,country=country[1L]), by="share.i
Hi,
> * Steve Lianoglou [2012-11-26 17:32:21
> -0500]:
>
>> --8<---cut here---start->8---
>>> f <- data.frame(id=rep(1:3,4),country=rep(6:8,4),delay=1:12)
>>> f
>>   id country delay
>> 1  1       6     1
>> 2  2       7     2
>> 3  3       8     3
>> 4
Hi,
On Mon, Nov 26, 2012 at 4:57 PM, Sam Steingold wrote:
[snip]
>> Could you please copy paste the output of `(head(infl, 20))` as
>> well as an approximation of what the result is that you want.
Don't know how "dput" got clipped in your reply from the quoted text I
wrote, but I actually asked
hi Steve,
> * Steve Lianoglou [2012-11-26 16:08:59
> -0500]:
> On Mon, Nov 26, 2012 at 3:13 PM, Sam Steingold wrote:
>>> * Steve Lianoglou [2012-11-19 13:30:03
>>> -0800]:
>>>
>>> For instance, if you want the min and max of `delay` within each group
>>> defined by `share.id`, and let's assum
Hi Sam,
On Mon, Nov 26, 2012 at 3:13 PM, Sam Steingold wrote:
> Hi,
>
>> * Steve Lianoglou [2012-11-19 13:30:03
>> -0800]:
>>
>> For instance, if you want the min and max of `delay` within each group
>> defined by `share.id`, and let's assume `infl` is a data.frame, you
>> can do something like
Hi,
> * Steve Lianoglou [2012-11-19 13:30:03
> -0800]:
>
> For instance, if you want the min and max of `delay` within each group
> defined by `share.id`, and let's assume `infl` is a data.frame, you
> can do something like so:
>
> R> infl <- as.data.table(infl)
> R> setkey(infl, share.id)
> R> result <
On Nov 19, 2012, at 1:25 PM, Sam Steingold wrote:
> Thanks Steve,
> what is the analogue of .N for min and max?
?seq
> i.e., what is the data.table's version of
> aggregate(infl$delay,by=list(infl$share.id),FUN=min)
> aggregate(infl$delay,by=list(infl$share.id),FUN=max)
> DT[, list( max(v)),
Hi,
On Mon, Nov 19, 2012 at 1:25 PM, Sam Steingold wrote:
> Thanks Steve,
> what is the analogue of .N for min and max?
> i.e., what is the data.table's version of
> aggregate(infl$delay,by=list(infl$share.id),FUN=min)
> aggregate(infl$delay,by=list(infl$share.id),FUN=max)
> thanks!
It would be
Thanks Steve,
what is the analogue of .N for min and max?
i.e., what is the data.table's version of
aggregate(infl$delay,by=list(infl$share.id),FUN=min)
aggregate(infl$delay,by=list(infl$share.id),FUN=max)
thanks!
Sam.
On Fri, Sep 14, 2012 at 3:40 PM, Steve Lianoglou
wrote:
> Hi,
>
> On Fri, Sep
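
For the question about an analogue of `.N` for min and max, a sketch along the following lines may help. The `infl` table here is a hypothetical stand-in with invented values; only the column names `share.id` and `delay` come from the thread. As far as data.table is concerned there is no special symbol needed: ordinary functions such as `min()` and `max()` can be called inside `j` and are evaluated once per group.

```r
library(data.table)

# Hypothetical stand-in for `infl`, with only the two columns named in the question.
infl <- data.frame(share.id = c("a", "b", "a", "c", "b", "a"),
                   delay    = c(10, 3, 7, 5, 8, 12),
                   stringsAsFactors = FALSE)

# Base R, as written in the question.
agg_min <- aggregate(infl$delay, by = list(infl$share.id), FUN = min)
agg_max <- aggregate(infl$delay, by = list(infl$share.id), FUN = max)

# data.table: both summaries in a single grouped pass.
dt  <- as.data.table(infl)
res <- dt[, list(min = min(delay), max = max(delay)), by = share.id]

# Sort by group so the rows line up with aggregate()'s output.
res <- res[order(share.id)]
print(res)
```

Computing both columns in one call avoids grouping the data twice, which is what the two separate `aggregate()` calls do.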
Hi,
On Fri, Sep 14, 2012 at 4:26 PM, Dennis Murphy wrote:
> Hi:
>
> This should give you some idea of what Steve is talking about:
>
> library(data.table)
> dt <- data.table(x = sample(10, 1000, replace = TRUE),
> y = rnorm(1000), key = "x")
> dt[, .N, by = x]
> syst
Hi:
This should give you some idea of what Steve is talking about:
library(data.table)
dt <- data.table(x = sample(10, 1000, replace = TRUE),
y = rnorm(1000), key = "x")
dt[, .N, by = x]
system.time(dt[, .N, by = x])
...on my system, dual core 8Gb RAM running Win7 6
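
Applied to the question that started the thread, the same `.N` idiom might look like the sketch below. `Z` here is a small random stand-in (the real table has about 10 million rows), and the sizes are arbitrary; the column names `V1` and `V2` and the `aggregate()` call are taken from the original post.

```r
library(data.table)

set.seed(1)
# Small stand-in for Z; V1 and V2 are character columns as in the original post.
Z <- data.frame(V1 = as.character(sample(100, 1000, replace = TRUE)),
                V2 = as.character(sample(20, 1000, replace = TRUE)),
                stringsAsFactors = FALSE)

# Approach from the original post: count rows per V2, then tabulate the counts.
slow <- table(aggregate(Z$V1, FUN = length, by = list(id = Z$V2))$x)

# data.table approach: .N is the number of rows in each group.
dt     <- as.data.table(Z)
counts <- dt[, .N, by = V2]
fast   <- table(counts$N)

print(fast)
```

On a toy table both versions finish instantly; any speed or memory difference would only show up at the scale described in the thread.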
n...@r-project.org] On
> Behalf
> Of Steve Lianoglou
> Sent: Friday, September 14, 2012 12:41 PM
> To: s...@gnu.org; r-help@r-project.org
> Subject: Re: [R] aggregate() runs out of memory
>
> Hi,
>
> On Fri, Sep 14, 2012 at 3:26 PM, Sam Steingold wrote:
> > I h
Hi,
On Fri, Sep 14, 2012 at 3:26 PM, Sam Steingold wrote:
> I have a large data.frame Z (2,424,185,944 bytes, 10,256,441 rows, 17
> columns).
> I want to get the result of
> table(aggregate(Z$V1, FUN = length, by = list(id=Z$V2))$x)
> alas, aggregate has been running for ~30 minutes, RSS is 14G,
I have a large data.frame Z (2,424,185,944 bytes, 10,256,441 rows, 17 columns).
I want to get the result of
table(aggregate(Z$V1, FUN = length, by = list(id=Z$V2))$x)
alas, aggregate has been running for ~30 minutes, RSS is 14G, VIRT is
24.3G, and no end in sight.
both V1 and V2 are characters (not