[R-sig-hpc] request help with replication and snowFT

Paul Johnson pauljohn32 at gmail.com
Sat Jul 23 18:46:20 CEST 2011


On Fri, Jul 22, 2011 at 6:58 PM, Hana Sevcikova <hanas at uw.edu> wrote:
> Paul,
>
> You need to pass the 'gentype' argument into clusterApplyFT, i.e.
>
> res<-clusterApplyFT(cl,rep(5,10), rnorm, gentype = "RNGstream")
>
> This is because clusterApplyFT has a different default generator than
> clusterSetupRNG.FT (I should change that). I guess it wasn't a big deal
> because these are meant to be internal functions used through the
> performParallel wrapper.
>
> Hana
>

Thanks.

Could you please explain more about the intended use of
performParallel? I tried, but could not find a convenient way to send
functions to the nodes before the main function runs.  I could
build everything into a package and install that on the nodes, but for
development and testing, that makes for a pretty tedious process.

I can't say what your students are willing to put up with to use a
cluster, but mine are not so patient as that.  We want to just write
functions, send them out, and see what happens.  In the snowFT
documentation, there are no examples like that, so I had to do some
guessing to see what might work. Would you care to put an example like
this in your documentation?

Could you also explain how a user can grab any one arbitrary stream and
re-run it for interactive investigation of its properties?  When we
run this thing 1000 times and 2 runs are really far from the usual
result, we want to dig in and see what happened.
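
To make the re-run request concrete, here is a minimal sketch of the idea
in base R only. It does not use snowFT: the L'Ecuyer-CMRG streams from the
parallel package stand in for the RNGstream generator, and make_streams()
and run_one() are hypothetical helper names invented for this
illustration. The point is only that if the starting state of every
stream is kept, any single run can be repeated on its own.

```r
## Sketch only: base R's L'Ecuyer-CMRG streams stand in for the snowFT
## RNGstream setup; make_streams() and run_one() are hypothetical helpers.
library(parallel)   # for nextRNGStream()

## Build a list holding the starting state of n independent streams.
## Note: this switches the session's RNG kind to L'Ecuyer-CMRG.
make_streams <- function(n, seed) {
  RNGkind("L'Ecuyer-CMRG")
  set.seed(seed)
  streams <- vector("list", n)
  s <- .Random.seed
  for (i in seq_len(n)) {
    streams[[i]] <- s
    s <- nextRNGStream(s)
  }
  streams
}

## Run replicate i from the start of its own stream.
run_one <- function(i, streams, x) {
  assign(".Random.seed", streams[[i]], envir = globalenv())
  rnorm(x[i])
}

r <- 33
x <- rep(5, r)
streams <- make_streams(r, seed = 1231)

## All runs, in order.
all_runs <- lapply(seq_len(r), run_one, streams = streams, x = x)

## Later, re-run only replicate 17 for a closer look.
again17 <- run_one(17, streams, x)
stopifnot(identical(all_runs[[17]], again17))
```

Because each run resets the generator to the saved start of its stream,
the result for run 17 does not depend on which runs came before it, which
is the property wanted from the cluster version.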

Here's my test case:

### r: number of streams. Set it to the BIGGEST number of runs (one
### stream per run) you could ever want to replicate.  It sets a framework
### of streams that is the same on all nodes.  Here I have 33 streams but
### only 10 nodes.  snowFT handles the problem of creating 33 separate
### streams, so there is one ready for each possible run, no matter which
### node is doing the work.
r <- 33
### cnt: number of nodes
cnt <- 10

library(snowFT)

cl <- makeClusterFT(cnt, type="MPI")

### From snowFT methods:
printClusterInfo(cl)

### Can use SNOW methods as well.
### Testing with SNOW methods: sends function to each system
clusterCall( cl, function() Sys.info()[c("nodename","machine")])

### Some user-written functions involved in a simulation
myA <- function(x){
  2 * x
}

myB <- function(x){
  3 * x
}

myC <- function(x, y){
  x + y
}

## The main function of interest
myNorm <- function (x){
  whew <-  myA(x)
  whewyou <- myB(whew)
  whewwho <- myC(whew, whewyou)
  y <- rnorm(whewwho)
  list(x, whew, whewyou, whewwho, y, sum(y))
}


mySeeds <- c(1231, 2323, 43435, 12123, 22442, 634654)
## Create the "x" vector: one input value per run.
myx <- sample(1:8, r, replace=TRUE)

## Send functions to systems with SNOW functions
clusterExport(cl, "myA")
clusterExport(cl, "myB")
clusterExport(cl, "myC")

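clusterSetupRNG.FT(cl, type = "RNGstream", streamper = "replicate", n = r,
                   seed = mySeeds)
## gentype is passed explicitly, following Hana's note above, because
## clusterApplyFT has a different default generator than clusterSetupRNG.FT
res1 <- clusterApplyFT(cl, x = myx, fun = myNorm, gentype = "RNGstream",
                       seed = mySeeds)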

print(res1[[1]])
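
For comparison, the same functions can be checked on one machine before
anything is sent to the nodes. This is only a sketch of a serial dry run,
with lapply standing in for the cluster call; the seed 1231 and the name
res0 are arbitrary choices for this illustration, and the random numbers
will not match the ones drawn from the cluster streams.

```r
## Sketch only: a serial stand-in for the cluster call, no snowFT or MPI.
myA <- function(x) 2 * x
myB <- function(x) 3 * x
myC <- function(x, y) x + y

myNorm <- function(x) {
  whew <- myA(x)
  whewyou <- myB(whew)
  whewwho <- myC(whew, whewyou)
  y <- rnorm(whewwho)
  list(x, whew, whewyou, whewwho, y, sum(y))
}

set.seed(1231)   # arbitrary seed, for a repeatable dry run only
myx <- sample(1:8, 33, replace = TRUE)
res0 <- lapply(myx, myNorm)

## Each run should draw 8 * x normals (2x + 6x = 8x).
draws <- vapply(res0, function(run) length(run[[5]]), integer(1))
stopifnot(all(draws == 8 * myx))

## Collect the per-run sums and list the two most extreme runs,
## the ones worth a closer look.
sums <- vapply(res0, function(run) run[[6]], numeric(1))
extreme <- order(abs(sums), decreasing = TRUE)[1:2]
print(extreme)
```

The indices in extreme are positions in myx, so the same positions
identify which runs to pull out of the cluster result.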




-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


