[R-sig-hpc] mixing MP with MPI in R?

Stephen Weston stephen.b.weston at gmail.com
Tue Jan 5 18:59:20 CET 2010


It's very easy to combine the use of snow and multicore.  Here's a simple
example of a parallel colMeans function.  First I define a version that
uses multicore, and then I call that version from the workers of a snow
cluster, so the two levels of parallelism are nested:

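library(snow)       # clusterApply, clusterCall, splitCols
library(multicore)  # mclapply

# Apply mean() to each single-column piece in forked processes on this machine.
mcColMeans <- function(m) {
  unlist(mclapply(splitCols(m, ncol(m)), mean))
}

# Send one block of columns to each cluster worker, which runs mcColMeans on it.
hybridColMeans <- function(m, cl) {
  unlist(clusterApply(cl, splitCols(m, length(cl)), mcColMeans))
}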

The only trick is to load the multicore package on each of the cluster
workers before calling "hybridColMeans".  You can do that with a
function such as:

initCluster <- function(cl) {
  clusterCall(cl, function() library(multicore))
}
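
As a rough end-to-end sketch of how these pieces might be put together:
the version below assumes a current R on a Unix-alike, where the bundled
parallel package provides both the snow-style cluster functions and
multicore's mclapply.  The splitCols defined here is a hypothetical
stand-in for snow's function of the same name, and the two local workers
stand in for two machines.

```r
library(parallel)  # assumption: used here in place of snow + multicore

# Hypothetical stand-in for snow's splitCols(): split a matrix into
# n contiguous column blocks, keeping each block a matrix.
splitCols <- function(m, n) {
  lapply(splitIndices(ncol(m), n), function(i) m[, i, drop = FALSE])
}

mcColMeans <- function(m) {
  unlist(mclapply(splitCols(m, ncol(m)), mean))
}

hybridColMeans <- function(m, cl) {
  unlist(clusterApply(cl, splitCols(m, length(cl)), mcColMeans))
}

cl <- makeCluster(2)  # two local workers; in practice, one per machine
invisible(clusterCall(cl, function() library(parallel)))  # same role as initCluster
clusterExport(cl, "splitCols")  # workers need the helper as well

m <- matrix(rnorm(1000), nrow = 100, ncol = 10)
result <- hybridColMeans(m, cl)
stopCluster(cl)

all.equal(result, colMeans(m))  # compare against the serial answer
```

With two real machines, makeCluster would be given the host names
instead of a worker count.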

I've experimented with various ways of doing hybrid parallelism in
the foreach and doMPI packages, but as you can see, it's easy
to do yourself.  You just need to be familiar with different methods
of splitting your data so that you can efficiently get parallelism
across different machines using snow, across different cores using
multicore, and still use vector operations if possible at the lowest
level.
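
To illustrate that last point, here is one possible variant, again only
a sketch under the same assumptions (a current R on a Unix-alike, the
parallel package in place of snow and multicore, and a hypothetical
splitCols helper).  Instead of one mean() call per column, each core gets
a block of columns and runs the vectorized colMeans on it.  The names
mcBlockColMeans and blockedColMeans and the cores argument are made up
for this example.

```r
library(parallel)  # assumption: used here in place of snow + multicore

# Hypothetical helper: split a matrix into n contiguous column blocks.
splitCols <- function(m, n) {
  lapply(splitIndices(ncol(m), n), function(i) m[, i, drop = FALSE])
}

# Within one machine: one block of columns per core, with the vectorized
# colMeans() on each block instead of mean() on each column.
mcBlockColMeans <- function(m, cores = 2) {
  blocks <- splitCols(m, min(cores, ncol(m)))
  unlist(mclapply(blocks, colMeans, mc.cores = cores))
}

# Across machines: one block of columns per cluster worker.
blockedColMeans <- function(m, cl, cores = 2) {
  unlist(clusterApply(cl, splitCols(m, length(cl)), mcBlockColMeans,
                      cores = cores))
}

cl <- makeCluster(2)  # two local workers standing in for two machines
invisible(clusterCall(cl, function() library(parallel)))
clusterExport(cl, "splitCols")

m <- matrix(rnorm(2000), nrow = 100, ncol = 20)
blocked <- blockedColMeans(m, cl)
stopCluster(cl)

all.equal(blocked, colMeans(m))  # compare against the serial answer
```

The block sizes are the tuning knob here: fewer, larger blocks mean less
scheduling and communication overhead per column.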

- Steve


On Mon, Jan 4, 2010 at 2:23 PM, Mark Kimpel <mwkimpel at gmail.com> wrote:
> I recently ran into a problem that was easier to solve using multicore
> than with Rmpi because I had a large matrix that I was performing
> calculations on, and copying it for each process ate up my 12GB of memory.
> I now have access to 2 Linux machines, each a Core i7 with 12GB of memory,
> and wonder how I might speed things up further by using some sort of
> combination of multicore and Rmpi/snow.
>
> I'm a novice at this, so perhaps the answer is obvious, but is it possible
> to spawn to multiple machines with non-shared memory but within each machine
> use shared memory? If my novice understanding is correct, the former uses
> Open MPI and the latter OpenMP.
>
> If this is possible, a self-contained example would be appreciated, even if
> the calculations are so trivial as to not make the parallelization
> worthwhile in the example case.
>
> Mark
>
> Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
>
> 15032 Hunter Court, Westfield, IN  46074
>
> (317) 490-5129 Work, & Mobile & VoiceMail
> (317) 399-1219 Skype No Voicemail please
>
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>


