[R] Rmpi performance

Martin Morgan mtmorgan at fhcrc.org
Fri Oct 13 18:40:51 CEST 2006


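clusterCall invokes the same function, with the same arguments, on each
of the three nodes, so every node inverts the full matrix and no work
is divided. The extra elapsed time is basically the communication cost
of sending XX (3000 x 3000 doubles, about 72 MB) to each node and
collecting the results.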
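
To see this on a single machine, here is a minimal sketch. It assumes
the 'parallel' package that ships with R (its cluster functions derive
from snow) and a local socket cluster instead of MPI; the matrix size
and the names n, res and bytes_per_node are only illustrative.

```r
library(parallel)  # assumption: base R's parallel package rather than snow + Rmpi

# The function from the question, with the variable names made consistent
op_mat <- function(mat) {
  inv <- solve(mat)
  c(det = det(inv), tr = sum(diag(inv)))
}

set.seed(1)
n <- 50                              # small illustrative size, not the 3000 used in the thread
XX <- matrix(rnorm(n * n), n, n)

cl <- makeCluster(2)                 # two local socket workers (assumption; the thread uses MPI)
res <- clusterCall(cl, op_mat, XX)   # each worker receives all of XX and repeats the whole job
stopCluster(cl)

length(res)                          # one result per worker
all.equal(res[[1]], res[[2]])        # the workers computed the same thing

# Data sent to each node for a 3000 x 3000 matrix of doubles, in bytes
bytes_per_node <- 3000 * 3000 * 8
bytes_per_node / 1e6                 # megabytes per node
```

Nothing here is faster than calling op_mat(XX) once on the master;
the cluster only adds the cost of moving XX around.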

You'll get the easiest gains from snow (and other parallel packages in
R) with 'embarrassingly parallel' problems, where the same algorithm is
applied to different data sets / slices of data. For performance gains
from a single call to op_mat, you'd have to do some serious parallel
algorithm development to distribute the data and computations
effectively.
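
A hedged sketch of the embarrassingly parallel case, again assuming
the 'parallel' package and a local socket cluster rather than MPI.
The list 'mats' of independent matrices is made up for illustration;
in a real problem these would be your separate data sets.

```r
library(parallel)  # assumption: base R's parallel package rather than snow + Rmpi

op_mat <- function(mat) {
  inv <- solve(mat)
  c(det = det(inv), tr = sum(diag(inv)))
}

set.seed(42)
n <- 100                                  # illustrative size only
# Six independent matrices standing in for six separate data sets
mats <- replicate(6, matrix(rnorm(n * n), n, n), simplify = FALSE)

cl <- makeCluster(2)                      # two local socket workers
par_res <- parLapply(cl, mats, op_mat)    # the list is split across workers, one matrix per task
stopCluster(cl)

ser_res <- lapply(mats, op_mat)           # same work done serially on the master
all.equal(par_res, ser_res)               # both routes should agree
```

Each worker now handles different data, so the work is actually
divided. Whether this beats the serial version still depends on how
the per-task computation compares with the cost of shipping each
matrix to a worker.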

Hope that helps,

Martin

Michela Cameletti <michela.cameletti at unibg.it> writes:

> Dear R users,
> we are trying to do some parallel computing using library(snow).
> In particular we have a cluster with 3 nodes
>
>>cl <- makeCluster(3, type = "MPI")
>         3 slaves are spawned successfully. 0 failed.
>
>
> and we want to compute the function op_mat (see below) first with the 
> master and then with the cluster using system.time for checking the 
> computational performance.
>
> op_mat = function(mat) {
>
> +           inv = solve(mat)
> +           det_inv = det(inv)
> +           tr_inv  = sum(diag(inv))
> +           return(list(c(det=det_inv,tr=tr_inv)))
> + }
>
>>nn = 3000
>>XX = matrix(rnorm(nn*nn),nn,nn)
> # with the master
>> system.time(op_mat(XX))
> [1] 42.283  1.883 44.168  0.000  0.000
> # with the cluster
>> system.time(clusterCall(cl,op_mat,XX))
> [1] 11.523 12.612 71.562  0.000  0.000
>
> You can see that using the master it takes 44.168 seconds for computing 
> the function on matrix XX while it takes 71.562 seconds (more time!!!) 
> with the cluster. Can you give us some advice in order to understand why 
> the cluster is slower than the master?
> Thank you very much in advance,
> bye
> Michela  and Marco
> Ps: we have a gigabit ethernet between the master and the nodes

-- 
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org


