[R-sig-hpc] Spanning a memory intensive call to lme

Lockwood, Glenn glock at sdsc.edu
Wed Dec 18 02:38:28 CET 2013


Katharina,

I don't think that what you are trying to do can be accomplished nearly that easily.  Using Rmpi directly is not for the faint of heart, and I strongly recommend using a nicer interface like snow or snowfall first.

More importantly though, you have to explicitly send the data that your parallel workers will operate on to each of your worker nodes before telling them to execute lme.  If you are having trouble fitting your entire dataset into memory, Rmpi will not help you since you would still have to load the dataset into the master node before sending pieces of it out to the workers.  The act of sending pieces to the workers, and the act of having the workers understand what to do with those pieces, is also something you will have to code in yourself.
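
As a rough sketch of what "sending pieces to the workers" looks like with the snow-style interface in the parallel package: the Orthodont data from nlme, the split by Sex, and the two local workers are all stand-ins assumed for illustration. Note that fitting each piece separately is not equivalent to one lme fit on the full data, and the full dataset still has to be loaded on the master first.

```r
library(parallel)  # ships with R; provides the snow-style cluster API
library(nlme)      # for lme() and the Orthodont example data

# Hypothetical stand-in for the real dataset: loaded in full on the master.
full_data <- as.data.frame(Orthodont)

# Split on the master into independent pieces (here: one per level of Sex).
pieces <- split(full_data, full_data$Sex)

# Start two local worker processes; on a cluster these would be remote hosts.
cl <- makeCluster(2)

# Each worker is a separate R process and must load nlme itself.
invisible(clusterEvalQ(cl, library(nlme)))

# clusterApply ships one piece to each worker and runs the function there.
fits <- clusterApply(cl, pieces, function(piece) {
  piece <- droplevels(piece)  # drop Subject levels not present in this piece
  fit <- lme(distance ~ age, random = ~ 1 | Subject, data = piece)
  fixef(fit)                  # return only the small result to the master
})
names(fits) <- names(pieces)

stopCluster(cl)
print(fits)
```

The design point is that the master does the splitting and the workers only ever see their own piece, so this helps only when the analysis can be broken into independent parts.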

In general, using Rmpi (or snow, or other libraries which build upon Rmpi) to get around memory limitations is very challenging.  Rmpi is meant to address problems which are bound by compute speed, not by memory limits.  I'm afraid it'll be an uphill battle because there is very little magic in what Rmpi does.
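
To illustrate why this does not relieve memory pressure, here is a minimal sketch, again using the parallel package with two local workers. The data frame big is a made-up stand-in. Exporting an object gives every worker its own full copy, so the total memory used goes up rather than down.

```r
library(parallel)

# Hypothetical stand-in for a large dataset held on the master.
big <- data.frame(x = rnorm(1e5), g = gl(100, 1000))

cl <- makeCluster(2)

# Copy the whole object to every worker.
clusterExport(cl, "big", envir = environment())

# Ask each worker how much memory its copy occupies.
worker_bytes <- unlist(clusterEvalQ(cl, as.numeric(object.size(big))))
master_bytes <- as.numeric(object.size(big))

stopCluster(cl)
print(c(master = master_bytes, workers = worker_bytes))
```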

Glenn


--
Glenn K. Lockwood, Ph.D.
User Services Group
San Diego Supercomputer Center
glock at sdsc.edu / (858) 246-1075

On Dec 16, 2013, at 2:31 AM, Katharina May <may.katharina at googlemail.com> wrote:

> Dear all,
> 
> I'm starting to use Rmpi for an lme (nlme) model which is very memory
> intensive, so I want to run the lme call on a Linux cluster.
> However, I haven't found a proper example of how to accomplish this,
> and I'm not really familiar with Rmpi. Hence, I hope somebody can help
> with this, I guess, rather trivial problem...
> Here is the code I wanted to use, but I'm not sure whether the lme call, as
> written here, will actually be spanned across several hosts within the
> cluster to get around the memory problem?
> 
> 
> #Load libraries
> library(nlme)
> library(mgcv)
> 
> # Load the R MPI package if it is not already loaded.
> if  (!is.loaded("mpi_initialize")) {
>  library("Rmpi")
> }
> 
> # In case R exits unexpectedly, have it automatically clean up
> # resources taken up by Rmpi (slaves, memory, etc...)
> .Last <- function(){
>  if (is.loaded("mpi_initialize")){
>    if (mpi.comm.size(1) > 0){
>      print("Please use mpi.close.Rslaves() to close slaves.")
>      mpi.close.Rslaves()
>    }
>    print("Please use mpi.quit() to quit R")
>    .Call("mpi_finalize")
>  }
> }
> 
> #actual call
> xylemRohTimeBoth41.lme <- mpi.remote.exec(lme(sapflow ~ NthSampling,
> random= ~NthSampling|site/sensor, data=xylemRoh2011, method="REML",
> na.action=na.omit,  control=lmeControl(opt = "optim")), simplify = TRUE,
> comm = 1, ret = TRUE)
> 
> Any help is very much appreciated and sorry for this I guess very basic
> question...
> 
> Many thanks.
> 
>           Katharina
> 


