[R-SIG-Mac] multicore package: collecting results

Mike Lawrence Mike.Lawrence at dal.ca
Wed Jun 29 20:59:43 CEST 2011


Is the slowdown happening while mclapply runs or while you're doing
the rbind? If the latter, I wonder if the code below is more efficient
than using rbind inside a loop:

my_df <- do.call(rbind, my_list_from_mclapply)


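To make that concrete, here is a minimal sketch of the whole pattern
(process_chunk and chunks below are made-up placeholders, not names
from your code):

library(multicore)   # provides mclapply()

# stand-in worker: each call returns a data frame with the same columns
process_chunk <- function(x) data.frame(chunk = x, value = rnorm(10))

chunks <- 1:100
my_list_from_mclapply <- mclapply(chunks, process_chunk)

# one rbind over the whole list; this avoids re-copying a growing
# data frame the way rbind() inside a loop does
my_df <- do.call(rbind, my_list_from_mclapply)
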

On Wed, Jun 29, 2011 at 3:34 PM, Vincent Aubanel <v.aubanel at laslab.org> wrote:
> Hi all,
>
> I'm using mclapply() from the multicore package to process chunks of data in parallel, and it works great.
>
> But when I want to collect all the processed elements of the returned list into one big data frame, it takes ages.
>
> The elements are all data frames with identical column names, and I'm using a simple rbind() inside a loop to combine them. But I guess rbind() does some expensive checking at each iteration, because the loop gets slower and slower as it goes. Writing the individual elements out to disk, concatenating the files with the system and reading the resulting file back in is actually faster...
>
> Is there a magic argument to rbind() that I'm missing, or is there some other way to collect the results of parallel processing efficiently?
>
> Thanks,
> Vincent
>
