[R] Rmpi performance

Thomas Lumley tlumley at u.washington.edu
Fri Oct 13 18:39:54 CEST 2006


On Fri, 13 Oct 2006, Michela Cameletti wrote:

> Dear R users,
> we are trying to do some parallel computing using library(snow).
> In particular we have a cluster with 3 nodes
>
>> cl <- makeCluster(3, type = "MPI")
>        3 slaves are spawned successfully. 0 failed.
>
>
> and we want to compute the function op_mat (see below) first with the
> master and then with the cluster using system.time for checking the
> computational performance.
>
> op_mat = function(mat) {
>
> +           inv = solve(mat)
> +           det_inv = det(inv)
> +           tr_inv  = sum(diag(inv))
> +           return(list(c(det=det_inv,tr=tr_inv)))
> + }
>
>> nn = 3000
>> XX = matrix(rnorm(nn*nn),nn,nn)
> # with the master
>> system.time(op_mat(XX))
> [1] 42.283  1.883 44.168  0.000  0.000
> # with the cluster
>> system.time(clusterCall(cl,op_mat,XX))
> [1] 11.523 12.612 71.562  0.000  0.000
>
> You can see that using the master it takes 44.168 seconds for computing
> the function on matrix XX while it takes 71.562 seconds (more time!!!)
> with the cluster. Can you give us some advice in order to understand why
> the cluster is slower than the master?

clusterCall() evaluates the same call on each computer in the cluster, so
it will always be slower than just evaluating on the master.  It is
useful for setup that has to be performed on each machine, or for parallel
evaluation of random functions (e.g. bootstrapping, simulation).
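
As a rough illustration of that behaviour, here is a minimal sketch. It
assumes the base-R parallel package (which provides snow-style functions
but is not what the original session used) and two local worker processes
rather than an MPI cluster; the matrix m is made up for the example.

```r
library(parallel)  # assumption: stand-in for snow, local workers instead of MPI

cl <- makeCluster(2)     # two local worker processes
m  <- diag(c(2, 4))      # small illustrative matrix, not the 3000 x 3000 case

# clusterCall() sends the same function and the same argument to every worker,
# so the whole computation is repeated once per worker.
res <- clusterCall(cl, function(mat) sum(diag(solve(mat))), m)

stopCluster(cl)

n_results <- length(res)                      # one result per worker
all_same  <- identical(res[[1]], res[[2]])    # each worker did identical work
```

Every worker returns the trace of the same inverse, so nothing is divided
up; the matrix is also shipped to each worker first, which only adds cost.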

To split up a single computation you have to do it explicitly, e.g. with
parLapply, parSapply, and parApply, or parMM for parallel matrix 
multiplication. It's unlikely that you could speed up inverting a dense 
matrix even with gigabit ethernet for communication -- the success of 
ATLAS and Dr Goto's tuned BLAS libraries shows that the time taken for 
dense linear algebra can be dominated by communications overhead even 
between a CPU and its own memory.
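
A sketch of what explicit splitting can look like with parLapply, again
assuming the base-R parallel package and local workers rather than
snow over MPI. The list mats of small matrices is hypothetical; the point
is only that independent tasks are divided among the workers. It does not
make a single large inversion any faster.

```r
library(parallel)  # assumption: stand-in for snow, local workers instead of MPI

# op_mat as in the question, with consistent variable names
op_mat <- function(mat) {
  inv <- solve(mat)
  c(det = det(inv), tr = sum(diag(inv)))
}

# Hypothetical workload: four independent 3 x 3 matrices (i times the identity)
mats <- lapply(1:4, function(i) diag(i, 3))

cl <- makeCluster(2)
par_res <- parLapply(cl, mats, op_mat)   # list elements are split across workers
stopCluster(cl)

seq_res <- lapply(mats, op_mat)          # same work done on the master only
same <- isTRUE(all.equal(par_res, seq_res))
```

Whether this pays off depends on each task being expensive relative to the
cost of sending its data to a worker.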

 	-thomas


