Re: [R] Please explain "do.call" in this context, or critique to "stack this list faster"

Hadley Wickham Sat, 04 Sep 2010 19:51:39 -0700

> One common way around this is to pre-allocate memory and then to
> populate the object using a loop, but a somewhat easier solution here
> turns out to be ldply() in the plyr package. The following is the same
> idea as do.call(rbind, l), only faster:
>
>> system.time(u3 <- ldply(l, rbind))
>   user  system elapsed
>   6.07    0.01    6.09


I think all you want here is rbind.fill:

> system.time(a <- rbind.fill(l))
   user  system elapsed
  1.426   0.044   1.471

> system.time(b <- do.call("rbind", l))
   user  system elapsed
     98      60     162

> all.equal(a, b)
[1] TRUE

This is considerably faster than do.call + rbind because I spent a lot
of time working out how to do this most efficiently. You can see the
underlying code at http://github.com/hadley/plyr/blob/master/R/rbind.r
- it's relatively straightforward except for ensuring the output
columns are the same type as the input columns.  This is a good
example where optimised R code is much faster than C code.
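
To illustrate the pre-allocation idea mentioned in the quoted text, here is
a minimal base-R sketch. It is not the plyr implementation: stack_prealloc
is a hypothetical helper, the input list is made up, and it assumes every
data frame has the same columns holding plain atomic vectors of the same
type (no factors), which sidesteps the column-type handling described above.

```r
# Hypothetical helper, NOT plyr's rbind.fill: stack data frames that share
# the same atomic columns by allocating each output column exactly once.
stack_prealloc <- function(dfs) {
  stopifnot(length(dfs) > 0)
  cols <- names(dfs[[1]])
  n_rows <- vapply(dfs, nrow, integer(1))
  ends <- cumsum(n_rows)
  starts <- ends - n_rows + 1L

  out <- lapply(cols, function(col) {
    # Allocate a vector of the right type and total length up front,
    # then copy each piece into its slot instead of growing the result.
    res <- vector(typeof(dfs[[1]][[col]]), sum(n_rows))
    for (i in seq_along(dfs)) {
      if (n_rows[i] > 0) res[starts[i]:ends[i]] <- dfs[[i]][[col]]
    }
    res
  })
  names(out) <- cols
  as.data.frame(out, stringsAsFactors = FALSE)
}

# Small made-up input (assumption: not the list from the original thread).
l <- lapply(1:100, function(i) data.frame(id = i, x = c(0.5, 1.5) * i))
a <- stack_prealloc(l)
b <- do.call("rbind", l)

# Compare the stacked columns value by value.
identical(a$id, b$id)
identical(a$x, b$x)
```

The point of the sketch is only that each column is written into storage
sized once from the total row count, rather than being copied and extended
on every append.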

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
