[R] Please explain "do.call" in this context, or critique to "stack this list faster"

Gabor Grothendieck ggrothendieck at gmail.com
Sat Sep 4 23:36:00 CEST 2010


On Sat, Sep 4, 2010 at 2:37 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> I've been doing some consulting with students who seem to come to R
> from SAS.  They are usually pre-occupied with do loops and it is tough
> to persuade them to trust R lists rather than keeping 100s of named
> matrices floating around.
>
> Often it happens that there is a list with lots of matrices or data
> frames in it and we need to "stack those together".  I thought it

This has nothing specifically to do with do.call, but note that
R is faster at handling matrices than data frames.  Below,
rbind-ing 4 data frames takes roughly 100 times as long as
rbind-ing 4 matrices holding the same data (250 replications
each; the matrix timing is close to the timer's resolution,
so the ratio is only approximate):

> mylist <-  list(iris[-5], iris[-5], iris[-5], iris[-5])
> L <- lapply(mylist, as.matrix)
>
> library(rbenchmark)
> benchmark(
+ df = do.call("rbind", mylist),
+ mat = do.call("rbind", L),
+ order = "relative", replications = 250
+ )
  test replications elapsed relative user.self sys.self user.child sys.child
2  mat          250    0.01        1      0.02     0.00         NA        NA
1   df          250    1.06      106      1.03     0.01         NA        NA
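
As for what do.call is doing here: it calls rbind with the elements
of the list as its arguments.  The following is only a rough sketch
of one way to use that, assuming every element can be coerced to a
matrix of a single type (true for the numeric iris columns, not for
mixed-type data frames).  The names stacked_mat, by_hand and
stacked_df are made up for illustration:

```r
# Same data as in the benchmark: four copies of the numeric iris columns.
mylist <- list(iris[-5], iris[-5], iris[-5], iris[-5])
L <- lapply(mylist, as.matrix)

# do.call("rbind", L) passes the list elements as the arguments of rbind,
# so it is the same call as rbind(L[[1]], L[[2]], L[[3]], L[[4]]).
stacked_mat <- do.call("rbind", L)
by_hand <- rbind(L[[1]], L[[2]], L[[3]], L[[4]])
identical(stacked_mat, by_hand)

# If a data frame is needed in the end, convert once after stacking
# rather than rbind-ing the data frames themselves.
stacked_df <- as.data.frame(stacked_mat)
dim(stacked_df)
```

Whether the single conversion at the end pays off depends on the
data, so it is worth re-running the benchmark on your own list.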

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


