[R-sig-hpc] Rmpi for non matrices results

Paul Johnson pauljohn32 at gmail.com
Sat Feb 4 19:29:34 CET 2012


Hi!

I wonder if you know that the vignette for the new R parallel package
has a worked example on bootstrapping.

You use a lot of Rmpi functions here that I don't use; I'm usually
writing a function and distributing it across the nodes with
clusterApply or similar, but it is always interesting to see how other
people do it.

If this program does run for you, I'd say you should count your
blessings. Is the only problem the format of the return value? If the
sum total of your question is that the return value is a 30 x 10 matrix
instead of a 300 x 1 column, may I suggest you read ?stack and ?unstack
to transform your output?  The plyr package has plenty of high-powered
tools for that, but from what I see here you don't need them.
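For a plain matrix, as.vector() may be even simpler than stack(), since
R stores matrix columns contiguously. A minimal sketch (the matrix name
and dimensions here just mirror your example; adapt as needed):

```r
# hypothetical 30 x 10 matrix, standing in for rssresult in your example
m <- matrix(1:300, nrow = 30, ncol = 10)

# stack the columns into one long vector, in column-major order
v <- as.vector(m)              # length 300

# wrap it back up as a 300 x 1 matrix, if you really need a column
onecol <- matrix(v, ncol = 1)  # dim is c(300, 1)
```

Columns come out in order, so column 1's 30 values are followed by
column 2's, and so on.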

You could also collect your results in a list, rather than in the
columns of a matrix, and then it is easy to put all those numbers into
a single vector (?unlist).
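Something like this, where each list element stands in for one slave's
vector of 30 results (rnorm here is just a placeholder for real output):

```r
# collect each slave's result vector in a list instead of matrix columns
results <- vector("list", 10)
for (i in 1:10) {
  results[[i]] <- rnorm(30)   # stand-in for one slave's 30 rss values
}

# flatten the list into a single vector of 300 numbers
allrss <- unlist(results)     # length 300
```

The list approach also has the advantage that the elements do not all
need to be the same length.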

Now, I have a question for you.  When I run Rmpi jobs on our cluster
these days, I get a warning about forking. Do you see that too?
--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process.  Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption.  The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.

The process that invoked fork was:

  Local host:          compute-2-11.local (PID 10326)
  MPI_COMM_WORLD rank: 0

If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
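As the message says, the warning can be silenced via the
mpi_warn_on_fork MCA parameter if you are sure fork() is harmless in
your setup. A command-line sketch (assumes Open MPI's mpirun launches
the job; the script name and process count are made up):

```shell
# silence the fork() warning for this job only (Open MPI MCA parameter)
mpirun --mca mpi_warn_on_fork 0 -np 8 R --no-save -f myscript.R
```

This only suppresses the warning; it does not make fork() under MPI any
safer.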
On Tue, Jan 31, 2012 at 4:27 PM,  <khalfala at math.mcmaster.ca> wrote:
> Hi there,
> I am facing some problems in transforming my R code in order to run it
> in parallel. The main idea of my code is to get 1000 bootstrap
> confidence intervals. I first have a loop of size 1000, and at each
> iteration we draw a bootstrap sample of size 500. This can take 6-16
> hours to get a set of 1000 CIs. Now I am interested in transforming my
> code using the task-push method. The thing that I don't get is how to
> relate my loop to each slave. I mean, if I want to divide the work as
> 100 CIs on each of 10 slaves, then how do I relate the rank of the
> slave to my loop?
> I found this example:
>
> # Initialize MPI
> library("Rmpi")
>
> # Notice we just say "give us all the slaves you've got."
> mpi.spawn.Rslaves()
>
> if (mpi.comm.size() < 2) {
>    print("More slave processes are required.")
>    mpi.quit()
>    }
>
> .Last <- function(){
>    if (is.loaded("mpi_initialize")){
>        if (mpi.comm.size(1) > 0){
>            print("Please use mpi.close.Rslaves() to close slaves.")
>            mpi.close.Rslaves()
>        }
>        print("Please use mpi.quit() to quit R")
>        .Call("mpi_finalize")
>    }
> }
>
> # Function the slaves will call to perform a validation on the
> # fold equal to their slave number.
> # Assumes: thedata,fold,foldNumber,p
> foldslave <- function() {
>    # Get a task
>    task <- mpi.recv.Robj(mpi.any.source(),mpi.any.tag())
>    task_info <- mpi.get.sourcetag()
>    tag <- task_info[2]
>
>    # While task is not a "done" message. Note the use of the tag to indicate
>    # the type of message
>    while (tag != 2) {
>        # Perform the task.
>        foldNumber <- task$foldNumber
>
>        rss <- double(p)
>        for (i in 1:p) {
>            # produce a linear model on the first i variables on training
> data
>            templm <- lm(y~.,data=thedata[fold!=foldNumber,1:(i+1)])
>
>            # produce predicted data from test data
>            yhat <- predict(templm,newdata=thedata[fold==foldNumber,1:(i+1)])
>
>            # get rss of yhat-y
>            localrssresult <- sum((yhat-thedata[fold==foldNumber,1])^2)
>            rss[i] <- localrssresult
>            }
>
>        # Construct and send message back to master
>        result <- list(result=rss,foldNumber=foldNumber)
>        mpi.send.Robj(result,0,1)
>
>        # Get a task
>        task <- mpi.recv.Robj(mpi.any.source(),mpi.any.tag())
>        task_info <- mpi.get.sourcetag()
>        tag <- task_info[2]
>        }
>
>    junk <- 0
>    mpi.send.Robj(junk,0,2)
>    }
>
> # We're in the parent.
> # first make some data
> n <- 1000       # number of obs
> p <- 30         # number of variables
>
> # Create data as a set of n samples of p independent variables,
> # make a "random" beta with higher weights in the front.
> # Generate y's as y = beta*x + random
> x <- matrix(rnorm(n*p),n,p)
> beta <- c(rnorm(p/2,0,5),rnorm(p/2,0,.25))
> y <- x %*% beta + rnorm(n,0,20)
> thedata <- data.frame(y=y,x=x)
>
> fold <- rep(1:10,length=n)
> fold <- sample(fold)
>
> summary(lm(y~x))
>
> # Now, send the data to the slaves
> mpi.bcast.Robj2slave(thedata)
> mpi.bcast.Robj2slave(fold)
> mpi.bcast.Robj2slave(p)
>
> # Send the function to the slaves
> mpi.bcast.Robj2slave(foldslave)
>
> # Call the function in all the slaves to get them ready to
> # undertake tasks
> mpi.bcast.cmd(foldslave())
>
>
> # Create task list
> tasks <- vector('list')
> for (i in 1:10) {
>    tasks[[i]] <- list(foldNumber=i)
>    }
>
> # Make the round-robin list for slaves
> n_slaves <- mpi.comm.size()-1
> slave_ids <- rep(1:n_slaves, length=length(tasks))
>
> # Send tasks
> for (i in 1:length(tasks)) {
>    slave_id <- slave_ids[i]
>    mpi.send.Robj(tasks[[i]],slave_id,1)
>    }
>
> # Collect results
> rssresult <- matrix(0,p,10)
> for (i in 1:length(tasks)) {
>    message <- mpi.recv.Robj(mpi.any.source(),mpi.any.tag())
>    foldNumber <- message$foldNumber
>    results    <- message$result
>    rssresult[,foldNumber] <- results
>    }
>
> # Perform closing handshake
> for (i in 1:n_slaves) {
>    junk <- 0
>    mpi.send.Robj(junk,i,2)
>    }
>
> for (i in 1:n_slaves) {
>    mpi.recv.Robj(mpi.any.source(),2)
>    }
>
> # plot the results
> plot(apply(rssresult,1,mean))
>
> mpi.close.Rslaves()
> mpi.quit(save="no")
>
> In this example the result is a 30x10 matrix, and they used 10
> slaves, which is the number of columns of the matrix. I tried
> looking for another example that doesn't deal with matrices, but couldn't
> find any. I am wondering: if the interest were instead, say, to get a
> single column of 300 numbers, with each slave contributing a column of
> 30 numbers, then how would we write the foldslave function, and how
> would we collect the result from each slave and arrange it in one column?
> Thank you in advance.
> Laila
>
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc



-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
