[Bioc-devel] BiocParallel
Ryan C. Thompson
rct at thompsonclan.org
Fri Nov 16 20:53:36 CET 2012
To be more specific, instead of:
library(parallel)
cl <- ... # Make a cluster
parLapply(cl, X, fun, ...)
you can do:
library(parallel)
library(doParallel)
library(plyr)
cl <- ...
registerDoParallel(cl)
llply(X, fun, ..., .parallel=TRUE)
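
For anyone who wants to run the comparison end to end, here is a minimal sketch of the two versions side by side. The two-worker local cluster from makeCluster(2), the input X, and the toy function fun are assumptions made up for illustration; they stand in for the elided "cl <- ..." above and are not from any real analysis.

```r
library(parallel)
library(doParallel)  # also attaches foreach
library(plyr)

# Assumption: a small local cluster stands in for the elided "cl <- ..."
cl <- makeCluster(2)

# Hypothetical input and worker function, chosen only for illustration
X <- 1:4
fun <- function(x) x^2

# Functional style, using the parallel package directly
res1 <- parLapply(cl, X, fun)

# Same computation through plyr; foreach dispatches to whichever
# backend is currently registered (here, the cluster above)
registerDoParallel(cl)
res2 <- llply(X, fun, .parallel = TRUE)

stopCluster(cl)
```

Both res1 and res2 should be lists holding the squared values. Depending on the plyr version, the llply call may emit warnings from the workers even though the results are fine.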
On Fri 16 Nov 2012 11:44:06 AM PST, Ryan C. Thompson wrote:
> You don't have to use foreach directly. I use foreach almost
> exclusively through the plyr package, which uses foreach internally to
> implement parallelism. Like you, I'm not particularly fond of the
> foreach syntax (though it has some nice features that come in handy
> sometimes).
>
> The appeal of foreach is that it supports pluggable parallelizing
> backends, so you can (in theory) write the same code and parallelize
> it across multiple cores, or across an entire cluster, just by
> plugging in different backends.
>
> On Fri 16 Nov 2012 10:17:24 AM PST, Michael Lawrence wrote:
>> I'm not sure I understand the appeal of foreach. Why not do this
>> within the functional paradigm, i.e., parLapply?
>>
>> Michael
>>
>> On Fri, Nov 16, 2012 at 9:41 AM, Ryan C. Thompson
>> <rct at thompsonclan.org> wrote:
>>
>> You could write a %dopar% backend for the foreach package, which
>> would allow any code using foreach (or plyr, which uses foreach) to
>> parallelize using your code.
>>
>> On a related note, it might be nice to add Bioconductor-compatible
>> versions of foreach and the plyr functions to BiocParallel if
>> they're not already compatible.
>>
>>
>> On 11/16/2012 12:18 AM, Hahne, Florian wrote:
>>
>> I've hacked up some code that uses BatchJobs but makes it look like a
>> normal parLapply operation. Currently the main R process checks the
>> state of the queue at regular intervals and fetches results once a job
>> has finished. Seems to work quite nicely, although there certainly are
>> more elaborate ways to deal with the synchronous/asynchronous issue. Is
>> that something that could be interesting for the broader audience? I
>> could add the code to BiocParallel for folks to try it out.
>> The whole thing may be a dumb idea, but I find it kind of useful to be
>> able to start parallel jobs directly from R on our huge SGE cluster,
>> have the calling script wait for all jobs to finish and then continue
>> with some downstream computations, rather than having to manually check
>> the job status and start another script once the results are there.
>> Florian
>>
>>
>>