[R] snow and Rmpi, delayed starting-times at nodes

Luke Tierney luke at stat.uiowa.edu
Fri Nov 2 16:54:40 CET 2007


On Fri, 2 Nov 2007, Markus Schmidberger wrote:

> Hello,
>
> we use R version 2.6.0, Rmpi_0.5-5 and snow_0.2-9 and have a parallel call 
> like this:
>
> clusterApply(cluster, input.list, function(input, data1, type) {  ....   }, 
> data1, type )

Most likely the problem is the direct use of function(...) ... here.
This captures the local environment in its closure, which is probably
very large.  Try defining the function you want to call at top level,
or define a top-level function to create your function if you do want
a closure with specific data captured.
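
A rough sketch of the difference, assuming the function is serialized
together with its enclosing environment when it is shipped to the nodes.
The names run_inline, run_toplevel, worker, make_worker and big are
hypothetical, and the function bodies are placeholders for the elided
code in the original call. No cluster is needed; the sketch only compares
serialized sizes.

```r
# 'big' stands in for whatever large objects live in the calling frame.
run_inline <- function() {
  big <- numeric(1e6)                          # about 8 MB of local data
  f <- function(input, data1, type) nchar(input)
  length(serialize(f, NULL))                   # f drags its defining frame along
}

# Defined at top level: the enclosing environment is the global
# environment, which is serialized as a reference, not by content.
worker <- function(input, data1, type) nchar(input)

run_toplevel <- function() {
  big <- numeric(1e6)                          # same local data, not captured
  length(serialize(worker, NULL))
}

# Top-level factory: the closure captures only what it is handed.
make_worker <- function(data1, type) {
  force(data1); force(type)                    # evaluate the arguments now
  function(input) paste(type, data1, length(input))
}

inline_bytes   <- run_inline()
toplevel_bytes <- run_toplevel()
factory_bytes  <- length(serialize(make_worker(1L, "a"), NULL))

# With a running cluster the call would then look something like:
#   clusterApply(cluster, input.list, worker, data1, type)
```

Comparing inline_bytes with toplevel_bytes should show the inline version
carrying the whole local frame, while the other two stay small.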

luke

>
> We now have the problem that the processes on the nodes start with a delay.
> For example, node 4 starts its calculation only when node 1 has finished (see
> the attached figure).
> As a result, we lose a lot of computation time.
>
> Our messages are not very big:
> input.list = list of vectors. Each vector has 10 strings
> data1 = one integer
> type = one string
>
> What can we do to improve the speed?
>
> Thanks
> Markus
>
>

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


