[R] performance of do.call("rbind")

Witold E Wolski wewolski at gmail.com
Mon Jun 27 18:54:50 CEST 2016


Hi Bert,

You are most likely right. I just thought that do.call("rbind", ...) was
somehow cleverer and allocated the memory up front. My error. After
more searching I found rbind.fill from plyr, which seems to do the
job (it computes the size of the result data.frame and allocates it
first).
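
To make the "compute the size, then allocate first" idea concrete, here is a
minimal base R sketch. It is not plyr's actual code: rbind_prealloc is a
hypothetical helper name, and it assumes every data.frame in the list has the
same column names and plain atomic (non-factor) columns.

```r
# Hypothetical helper (not part of plyr or base R): bind a list of data
# frames by allocating each result column once and filling it in place.
# Assumes identical column names and atomic, non-factor columns throughout.
rbind_prealloc <- function(df_list) {
  stopifnot(length(df_list) > 0)
  n_rows <- vapply(df_list, nrow, integer(1))
  total <- sum(n_rows)
  # Row range that each list element occupies in the result.
  ends <- cumsum(n_rows)
  starts <- ends - n_rows + 1L
  col_names <- names(df_list[[1]])
  cols <- lapply(col_names, function(nm) {
    # Allocate the full-length column up front, typed like the first piece.
    out <- vector(typeof(df_list[[1]][[nm]]), total)
    for (i in seq_along(df_list)) {
      if (n_rows[i] > 0L) {
        out[starts[i]:ends[i]] <- df_list[[i]][[nm]]
      }
    }
    out
  })
  names(cols) <- col_names
  data.frame(cols, check.names = FALSE, stringsAsFactors = FALSE)
}

# Small made-up example: 1000 data frames of 100 x 3 each.
data.list <- lapply(1:1000, function(i) {
  data.frame(id = rep(i, 100), x = rnorm(100), label = rep("a", 100),
             stringsAsFactors = FALSE)
})
data <- rbind_prealloc(data.list)
dim(data)
```

With plyr installed, the corresponding call would simply be
plyr::rbind.fill(data.list), since rbind.fill accepts a list of data frames
as its first argument.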

best

On 27 June 2016 at 18:49, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> The following might be nonsense, as I have no understanding of R
> internals; but ....
>
> "Growing" structures in R by iteratively adding new pieces is often
> said to be inefficient when the number of iterations is large, and
> your rbind() invocation might fall under this rubric. If so, you might
> try issuing the call, say, 20 times, each over a disjoint subset of 10k
> list elements, and then rbinding the 20 resulting large frames.
>
> Again, caveat emptor.
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Jun 27, 2016 at 8:51 AM, Witold E Wolski <wewolski at gmail.com> wrote:
>> I have a list (variable name data.list) with approx. 200k data.frames,
>> each with dimensions of approx. 100 x 3.
>>
>> A call
>>
>> data <- do.call("rbind", data.list)
>>
>> does not complete; the run time is prohibitive (I killed the R session
>> after 5 minutes).
>>
>> I would think that merging data.frames is a common operation. Is
>> there a better (more performant) function that I could use?
>>
>> Thank you.
>> Witold
>>
>>
>>
>>
>> --
>> Witold Eryk Wolski
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



-- 
Witold Eryk Wolski


