[Bioc-devel] BiocParallel
Ryan C. Thompson
rct at thompsonclan.org
Sat Nov 17 22:05:29 CET 2012
On 11/17/2012 02:39 AM, Ramon Diaz-Uriarte wrote:
> In addition to Steve's comment, is it really a good thing that "all code
> stays the same."? I mean, multiple machines vs. multiple cores are,
> often, _very_ different things: for instance, shared vs. distributed
> memory, communication overhead differences, whether or not you can assume
> packages and objects to be automagically present in the slaves/child
> process, etc. So, given they are different situations, I think it
> sometimes makes sense to want to write different code for each situation
> (I often do); not to mention Steve's hybrid cases ;-).
>
>
> Since BiocParallel seems to be a major undertaking, maybe it would be
> appropriate to provide a flexible approach, instead of hard wiring the
> foreach approach.
Of course there are cases where the same code simply can't work for both
multicore and multi-machine situations, but those generally don't fall
into the category of things that can be done using lapply. lapply and
its parallel counterparts (mclapply, parLapply, and foreach) are
designed for data-parallel operations with no interdependence between
results. Such operations generally parallelize as well across machines
as across cores, unless the network is too slow, in which case you
would simply not use multi-machine parallelism. If you want a parallel
algorithm for something like the disjoin method of GRanges, you might
need to write special-purpose code, and that code might be very
different for multicore vs. multi-machine.
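
To make the lapply family concrete, here is a minimal sketch using only
the base parallel package. The workload (slow_square, scale_factor,
inputs) is made up for illustration. It runs the same independent
per-element computation three ways and shows the difference Ramon
mentions: forked children see the parent's objects, while cluster
workers need them exported.

```r
library(parallel)

# Hypothetical workload: each element is processed independently.
inputs <- 1:8
scale_factor <- 10                       # a global the workers will need
slow_square <- function(x) scale_factor * x^2

# 1. Serial baseline.
serial <- lapply(inputs, slow_square)

# 2. Multicore: forked children inherit the parent's memory, so
#    scale_factor is visible without any extra work. Forking is not
#    available on Windows, so fall back to one core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
forked <- mclapply(inputs, slow_square, mc.cores = n_cores)

# 3. Cluster of separate R processes (they could live on other machines):
#    nothing is shared, so the global is exported explicitly.
cl <- makeCluster(2)
clusterExport(cl, "scale_factor", envir = environment())
clustered <- parLapply(cl, inputs, slow_square)
stopCluster(cl)

# All three approaches should produce the same values.
all_agree <- isTRUE(all.equal(unlist(serial), unlist(forked))) &&
  isTRUE(all.equal(unlist(serial), unlist(clustered)))
```

Note that the computation itself is identical in all three cases; only
the call signature and the setup around it differ.
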
So yes, sometimes there is a fundamental reason that you have to change
the code to make it run on multiple machines, and neither foreach nor
any other parallelization framework will save you from rewriting it.
Often, though, there is no fundamental reason for the code to change,
and you end up changing it anyway because of limitations in your
parallelization framework. This is the case that foreach saves you from.
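
As a rough sketch of what I mean (assuming the foreach and doParallel
packages are installed; run_task and xs are made-up names), the loop
below is written once and then run under two different backends. Only
the registration step changes. A cluster spanning several hosts would be
registered in the same way as the local one here.

```r
library(foreach)

# Hypothetical data-parallel task; this loop body never changes below.
run_task <- function(xs) {
  foreach(x = xs, .combine = c) %dopar% sqrt(x)
}

xs <- c(1, 4, 9, 16)

# Backend 1: plain sequential execution.
registerDoSEQ()
seq_result <- run_task(xs)

# Backend 2: a local cluster of two worker processes.
cl <- parallel::makeCluster(2)
doParallel::registerDoParallel(cl)
par_result <- run_task(xs)
parallel::stopCluster(cl)
registerDoSEQ()   # do not leave a stopped cluster registered

# Both backends should give the same answer.
same_answer <- isTRUE(all.equal(seq_result, par_result))
```
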