[R] Parallel computing with the snow package: external file I/O possible?

Waichler, Scott R Scott.Waichler at pnl.gov
Tue Mar 14 00:50:13 CET 2006


I am trying to do model autocalibration using the snow and rgenoud
packages.  The function I want to run in task-parallel fashion across
multiple machines is one that pre- and post-processes data and runs an
external model code.  My problem is that external file I/O is happening
only in the master node and not in the slaves.  I have followed Jasjeet
Sekhon's suggestion to test the cluster setup, and that is fine:

> library(snow)
> #pick two machines
> cl <- makeCluster(c("moab","escalante"))
> clusterCall(cl, sin, 2)
> The output should be:
> > clusterCall(cl, sin, 2)
> [[1]]
> [1] 0.9092974
> [[2]]
> [1] 0.9092974

I do indeed get the above result, so I presume the network setup is ok.
Next I tested a function that creates a file.  Here is the code that I
sourced from the master ("moab"):

# begin script

cl <- makeCluster(c("moab", "escalante"), type="SOCK")

# Define base pathname for output from my.test() 
base.dir <- "./test"

# Define a function that includes some file I/O 
my.test <- function(base.dir) {
  # to tag the node that makes the file
  this.host <- as.character(system("hostname"))
  # to be 'sure' the files have different names
  this.rnd <- sample(1:1e6, 1)
  test.file <- paste(sep="", base.dir, "_", this.host, "_", this.rnd)
}  # end my.test()

g <- clusterCall(cl, my.test, base.dir)
#  end script  

The output (g) was as follows:
[1] TRUE

[1] TRUE

But only one file was created, which I suspect was written by the master
node.  The process on the slave did not create a second file.  Also,
system("hostname") returns the number 0 for moab instead of the name.
Any ideas as to what might be wrong?
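
Two things may be worth checking here, sketched below under stated
assumptions.  First, system() returns the command's exit status unless
intern=TRUE is given, which would explain the 0.  Second, a relative path
such as "./test" is resolved against each process's own working
directory, so a file written by the slave may be in a different directory
than expected.  The name my.test2 and the use of file.create() are
assumptions for illustration; the listing above does not show the call
that actually writes the file.

```r
# Hypothetical variant of my.test() that reports where it ran and what it wrote.
my.test2 <- function(base.dir) {
  # intern = TRUE captures the command's output as a character vector;
  # without it, system() returns the exit status (0 on success).
  this.host <- system("hostname", intern = TRUE)[1]
  this.rnd <- sample(1:1e6, 1)
  test.file <- paste(sep = "", base.dir, "_", this.host, "_", this.rnd)
  # Assumption: the file is created with file.create(), which returns TRUE on success.
  created <- file.create(test.file)
  # Return the working directory too, so relative paths can be traced per node.
  list(host = this.host, wd = getwd(), file = test.file, created = created)
}

# Local check, writing into the session's temporary directory.
res <- my.test2(file.path(tempdir(), "test"))
print(res)
```

With the cluster from the script, clusterCall(cl, my.test2, base.dir)
should return one such list per node, and the wd entry shows which
directory each node used for "./test".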

Scott Waichler
scott.waichler _at_ pnl.gov
