# [R-sig-hpc] resolved Rmpi with simulation study

Martin Morgan mtmorgan at fhcrc.org
Mon Nov 9 18:31:39 CET 2009

```Hodgess, Erin wrote:
> Hi again.
>
> I calculated the closest larger number of sims based on the mpi.comm.size and did a loop based on that.

A different strategy is to aim first for an lapply implementation of
your overall program (for single threaded and to get the syntax correct)
and then simply change to mpi.parLapply. Use the job.num argument to
determine how to divide the list of tasks among nodes (if communication
costs are important then leave as-is, which divides the tasks as evenly
as possible amongst all nodes; the other extreme is
job.num=length(<whatever lapply is iterating through>), which will
allocate one task at a time, maximizing communication but providing
fine-grained 'load balancing' appropriate if each simulation has a very
unpredictable time for execution).

If you think carefully about your lapply, and arrange so that the
function takes in all required arguments, then there is no need for
communication costs, mpi.parLapply does all the work

For instance,

myMatrix <- matrix(rnorm(1000000), 100) # 100 x 1000
res <- mpi.parSapply(seq_len(nrow(myMatrix)), function(i, m) {
## do the real work here, we'll do something inappropriate
mean(m[i,])
}, myMatrix)
all.equal(res, rowMeans(myMatrix))

forwards myMatrix efficiently.

Martin

>
> I got about 4 times speedup with 26 nodes, 2 times speedup with 7 nodes.
>
> The communication must be the slowdown factor here.
>
> But for what it's worth, here are the functions, as well:
>
> [hodgess at grid bin]\$ cat linsim.R
> chow1 <- function() {
> 	x <- bigx[,fold]
> 	y <- bigy[,fold]
> 	nm <- length(x)
> 	n <- nm/m
> 	xq <- yq <- matrix(0,nrow=n,ncol=1)
> 	c1 <- (1/m)*cdgen(n,m)
> 	xq <- c1 %*% x
> 	yq <- c1 %*% y
> 	vv <- c1 %*% t(c1)
> 	vvi <- solve(vv)
> 	xvx <- t(xq) %*% vvi %*% xq
> 	xvxi <- solve(xvx)
> 	beta <- xvxi %*% t(xq) %*% vvi %*% yq
> 	res1 <- yq - xq %*% beta
> 	cv1 <- t(c1) %*% vvi
> 	newy <- x %*% beta + cv1%*%res1
> 	w <- sum((x-newy)^2)/360
> 	return(w)
> }
>
>
>
>
> cdgen <- function(n,m) {
> 	x <- rep(1,m)
> 	nm <- n*m
> 	cd <- matrix(0,nrow=n,ncol=nm)
> 	xg <- seq(from=1,by=m,length=n)
> 	xh <- seq(from=m,by=m,length=n)
> 	for(i in 1:n) {
> 		cd[i,xg[i]:xh[i]] <- x
> 	}
> 	return(cd)
> }
>
>
> linsim <- function(n1=10,phi1=0.5) {
> library("Rmpi")
>
> # Notice we just say "give us all the slaves you've got."
> mpi.spawn.Rslaves()
>
> if (mpi.comm.size() < 2) {
>     print("More slave processes are required.")
>     mpi.quit()
>     }
>
> .Last <- function(){
>         if (mpi.comm.size(1) > 0){
>             print("Please use mpi.close.Rslaves() to close slaves.")
>             mpi.close.Rslaves()
>         }
>         print("Please use mpi.quit() to quit R")
>         .Call("mpi_finalize")
>     }
>
> }
>
>
> 	w <- NULL
> 	m <- 3
> #Fix up closest multiple from mpi.comm.size
> 	n2 <- ceiling(n1/(mpi.comm.size()))
> 	n1 <- n3 <- n2*(mpi.comm.size())
> 	xx <- min(mpi.comm.size(),n1)
> 	cat("xx",xx,n1,n2,"\n")
>
> 	iz <- 0
> #run loop based on mpi.comm.size
> 	while(n1 >=1) {
> 	bigx <- matrix(arima.sim(list(order=c(1,0,0),ar=phi1),360*xx),
> 		nrow=360,ncol=xx)
> 	bigy <- matrix(arima.sim(list(order=c(1,0,0),ar=phi1),360*xx),
> 		nrow=360,ncol=xx)
> 	mpi.bcast.Robj2slave(phi1)
> 	mpi.bcast.Robj2slave(m)
> 	mpi.bcast.Robj2slave(chow1)
> 	mpi.bcast.Robj2slave(cdgen)
> 	mpi.bcast.Robj2slave(bigx)
> 	mpi.bcast.Robj2slave(bigy)
> 	mpi.bcast.cmd(fold <- mpi.comm.rank())
> 	mpi.bcast.cmd(chow1)
>
> 	newone <- mpi.remote.exec(chow1())
> #	newone <- chow1(x=x1,y=y1,m=3)
> 	w <- c(w,unlist(newone))
> 	n1 <- n1 - xx
> 	xx <- min(n1,mpi.comm.size())
> 	}
> 	zzz <- list(total=n3,mean=mean(w,na.rm=TRUE),sd=sd(w,na.rm=TRUE)/sqrt(360))
> 	return(zzz)
> }
>
>
> Thanks,
> Erin
>
>
> Erin M. Hodgess, PhD
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: hodgesse at uhd.edu
>
>
>
> -----Original Message-----
> From: Brian G. Peterson [mailto:brian at braverock.com]
> Sent: Mon 11/9/2009 3:56 AM
> To: Hodgess, Erin
> Cc: r-sig-hpc at r-project.org
> Subject: Re: [R-sig-hpc] resolved Rmpi with simulation study
>
> Hodgess, Erin wrote:
>> I came up with a solution...not a great one, but it runs.
>
> So, for closure, and the archives, what was the solution you came up with and
> your diagnosis of the problem?
>
> I have typically used 'rsprng' or 'random' to get random numbers that are
> different on different nodes in a parallel operation.
>
> Regards,
>
>     - Brian
>

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

```