[R-sig-hpc] parallel execution on a single machine: sockets vs multicore?

Brian G. Peterson brian at braverock.com
Tue Aug 21 10:43:40 CEST 2012


On 08/20/2012 06:52 PM, Peter Langfelder wrote:
> I would like to parallelize some of my R code using either the socket
> cluster approach or the parallel execution using multicore. At this
> point I am leaning towards sockets since they also work on Windows
> which would make my code more portable. I am curious whether there
> are some disadvantages of sockets vs. multicore - I know that with
> multicore, worker processes don't incur extra memory overhead for
> objects they use as read-only (because of modern OS copy-on-write
> approach). Is that also true for a socket cluster run on a single
> machine?
>
> Thanks in advance for any insights.

I recommend using the 'foreach' package for any code you intend to 
distribute to others, or for code that you may later want to move from 
one parallelization backend to another (in your example, because of 
switching between Windows and Linux/Mac).

This lets you write the code once using 'foreach' and %dopar%. It 
falls back gracefully to sequential execution when no parallel back end 
is registered, and it supports many different parallel back ends, 
notably in this case 'doParallel', which will automatically use 
multicore on *nix and sockets on Windows with minimal to no 
intervention on your part.
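
As a rough illustration of the above (a minimal sketch, not tested on 
every platform): the worker count of 2 and the toy loop body are my own 
arbitrary choices, and it assumes 'foreach' and 'doParallel' are 
installed.

```r
library(foreach)
library(doParallel)

# Register a back end with 2 workers (arbitrary count for this sketch).
# doParallel starts a socket cluster on Windows and forks workers elsewhere.
registerDoParallel(cores = 2)

# The loop itself does not mention the back end at all.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

print(squares)

# Shut down the socket cluster if doParallel created one implicitly.
stopImplicitCluster()
```

The only platform-aware line is the registerDoParallel() call; the 
foreach loop stays the same on Windows and on *nix.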

Note that there is a small amount of overhead for the flexibility of 
multiple back ends, but I find the advantages in a lot of code I write 
to outweigh this significantly.

Another advantage is that you can debug single threaded, easily 
expand to multiple cores on a single machine for the next round of 
testing, then distribute over a cluster of many nodes (using e.g. the 
doRedis or doMPI back ends).
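
A sketch of that workflow, under the assumption that 'foreach' and 
'doParallel' are installed. The function name run_job and the toy 
computation are hypothetical, made up for illustration. A cluster back 
end such as doRedis or doMPI would be registered at the same point as 
the local cluster; that step is not shown here.

```r
library(foreach)
library(doParallel)

# Hypothetical job: the loop is written once and never edited again.
run_job <- function(n) {
  foreach(i = seq_len(n), .combine = c) %dopar% sqrt(i)
}

# Step 1: debug single threaded. registerDoSEQ() ships with 'foreach'.
registerDoSEQ()
seq_result <- run_job(4)

# Step 2: run the same function on two local worker processes.
cl <- makeCluster(2)
registerDoParallel(cl)
par_result <- run_job(4)
stopCluster(cl)

# Return to the sequential back end so later %dopar% calls still work.
registerDoSEQ()

print(all.equal(seq_result, par_result))
```

Only the registration lines change between rounds of testing; run_job 
itself is untouched.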

I've also always thought that the proliferation of different *apply 
functions in R, with slightly different syntaxes for different 
parallelization approaches, is a very messy architectural choice, 
since it requires me to perform surgery on working code to change 
parallelization methods...
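
To make the complaint concrete, here is a small sketch using the 
'parallel' package that ships with R. The data x, the function f, and 
the worker count of 2 are arbitrary choices of mine for illustration.

```r
library(parallel)

x <- 1:4
f <- function(i) i^2

# Sequential: data first, then the function.
r_seq <- lapply(x, f)

# Fork-based: same argument order plus mc.cores. Forking is not available
# on Windows, where mc.cores must be 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
r_fork <- mclapply(x, f, mc.cores = cores)

# Socket-based: the cluster object comes first, and the cluster has to be
# created and stopped explicitly.
cl <- makeCluster(2)
r_sock <- parLapply(cl, x, f)
stopCluster(cl)

print(identical(r_seq, r_fork))
print(identical(r_seq, r_sock))
```

Three call shapes for the same computation, so switching approach means 
editing every call site.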


Regards,

    - Brian

-- 
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock


