[Bioc-devel] BiocParallel

Ryan C. Thompson rct at thompsonclan.org
Fri Nov 16 20:44:06 CET 2012


You don't have to use foreach directly. I use foreach almost 
exclusively through the plyr package, which uses foreach internally to 
implement parallelism. Like you, I'm not particularly fond of the 
foreach syntax (though it has some nice features that come in handy 
sometimes).

The appeal of foreach is that it supports pluggable parallelizing 
backends, so you can (in theory) write the same code and parallelize it 
across multiple cores, or across an entire cluster, just by plugging in 
different backends.
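
As a rough sketch of what I mean (not code from BiocParallel; the helper
name square_all is made up for illustration, and doParallel is just one
of several available backends), the same foreach loop can be run
sequentially or on local worker processes by changing only which
backend is registered:

```r
library(foreach)
library(doParallel)  # one possible backend; others register the same way

# Hypothetical helper: the loop body never mentions a backend.
square_all <- function(xs) {
  foreach(x = xs, .combine = c) %dopar% {
    x^2
  }
}

# Run with the built-in sequential backend.
registerDoSEQ()
seq_result <- square_all(1:8)

# Run the identical code on two local worker processes.
cl <- makeCluster(2)
registerDoParallel(cl)
par_result <- square_all(1:8)
stopCluster(cl)
registerDoSEQ()  # restore the sequential backend after the cluster is gone

# Both backends should produce the same values.
stopifnot(identical(seq_result, par_result))
```

A backend that targets a cluster scheduler would be registered in the
same place, leaving square_all untouched.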

On Fri 16 Nov 2012 10:17:24 AM PST, Michael Lawrence wrote:
> I'm not sure I understand the appeal of foreach. Why not do this
> within the functional paradigm, i.e., parLapply?
>
> Michael
>
> On Fri, Nov 16, 2012 at 9:41 AM, Ryan C. Thompson
> <rct at thompsonclan.org> wrote:
>
>     You could write a %dopar% backend for the foreach package, which
>     would allow any code using foreach (or plyr, which uses foreach) to
>     parallelize using your code.
>
>     On a related note, it might be nice to add Bioconductor-compatible
>     versions of foreach and the plyr functions to BiocParallel if
>     they're not already compatible.
>
>
>     On 11/16/2012 12:18 AM, Hahne, Florian wrote:
>
>         I've hacked up some code that uses BatchJobs but makes it look
>         like a normal parLapply operation. Currently the main R process
>         is checking the state of the queue in regular intervals and
>         fetches results once a job has finished. Seems to work quite
>         nicely, although there certainly are more elaborate ways to deal
>         with the synchronous/asynchronous issue. Is that something that
>         could be interesting for the broader audience? I could add the
>         code to BiocParallel for folks to try it out.
>         The whole thing may be a dumb idea, but I find it kind of useful
>         to be able to start parallel jobs directly from R on our huge
>         SGE cluster, have the calling script wait for all jobs to finish
>         and then continue with some downstream computations, rather than
>         having to manually check the job status and start another script
>         once the results are there.
>         Florian
>
>
>
