[R] hwo to speed up "aggregate"

analys...@hotmail.com Wed, 26 Jan 2011 04:00:29 -0800

I have

> df
   quantity branch client       date  name
1        10      1      1 2010-01-01   one
2        20      2      1 2010-01-01   one
3        30      3      2 2010-01-01   two
4        15      4      1 2010-01-01   one
5        10      5      2 2010-01-01   two
6        20      6      3 2010-01-01 three
7      1000      1      1 2011-01-01   one
8      2000      2      1 2011-01-01   one
9      3000      3      2 2011-01-01   two
10     1500      4      1 2011-01-01   one
11     1000      5      2 2011-01-01   two
12     2000      6      3 2011-01-01 three


I want to aggregate away the branch. I followed a suggestion by Gabor
(thanks) and did

> aggregate(list(quantity=df$quantity),list(client=df$client,date=df$date),sum)
  client       date quantity
1      1 2010-01-01       45
2      2 2010-01-01       40
3      3 2010-01-01       20
4      1 2011-01-01     4500
5      2 2011-01-01     4000
6      3 2011-01-01     2000

I want df$name also in the output and did what looked obvious:

> aggregate(list(quantity=df$quantity),list(client=df$client,date=df$date,name=df$name),sum)
  client       date  name quantity
1      1 2010-01-01   one       45
2      1 2011-01-01   one     4500
3      3 2010-01-01 three       20
4      3 2011-01-01 three     2000
5      2 2010-01-01   two       40
6      2 2011-01-01   two     4000

It seems to work, but slows down tremendously for a dataframe with
around a 1000 rows.

Could anyone explain what is going on and suggest a way out?

Thanks.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] hwo to speed up "aggregate"

Reply via email to