[R-sig-hpc] Parallel loop

Thu Mar 8 17:37:21 CET 2012

Thanks for the feedback.  Part of my problem is that I need the most
recent copy of R.  My IS team told me they can only get an earlier
copy using apt-get, but I think there must be a way so I've tasked them
with figuring it out.  

My OS is Kubuntu. I just thought it odd that it could only detect the 
number of CPUs rather than the number of cores.  I'm completely
new to parallel processing but it seems that something is not right
in the core detection.

One quetion I still have regrds child processes. I understand they 
finish on their own, but what if I need to kill those processing because 
of an obvious problem.  How can I do that?

----- Original Message -----
From: Simon Urbanek <simon.urbanek at r-project.org>
To: Scott Raynaud <scott.raynaud at yahoo.com>
Cc: "r-sig-hpc at r-project.org" <r-sig-hpc at r-project.org>
Sent: Tuesday, March 6, 2012 8:27 PM
Subject: Re: [R-sig-hpc] Parallel loop

Scott,

On Mar 6, 2012, at 4:19 PM, Scott Raynaud wrote:

> I'm trying to convert a loop in a simulation program to a parallel
> process using multicore.  It looks like this:
> 
> simus=1000 
> set.seed(12345)
> for(iter in 1:simus){
>  
> #bunch of code setting up random draws from various 
> #distributions for X matrix 
>  
> #Computaion of Y
>  
> (fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=2))
>  
> estbeta<-fixef(fitmodel)
>  sdebeta<-sqrt(diag(vcov(fitmodel)))
>  for(l in 1:betasize)
>  {  
>    cibeta<-estbeta[l]-sgnbeta[l]*z1score*sdebeta[l]
>    if(beta[l]*cibeta>0)              powaprox[[l]]<-powaprox[[l]]+1
>      sdepower[l,iter]<-as.numeric(sdebeta[l])
>  }  
> ##------------------------------------------------------------------------##
>                          } ##  iteration end here
>  
>  
> The variable fitmodel is defined elsewhere and consists of
> fixed and random parts.  Betasize is the length of the vector of
> fixed effects.  Looks like near the bottom there's a 
> counter that increments if the slope estimate doesn't trap 0.  
> The next line records the standard deviation.  It's these two items
> that I need to recover.
>  
> So I'm thinking that I'm going to recode this way to run on
> a machine that has 4-Xeon CPUs:
>  
> library(multicore)
> options(cores=4)
> simus=1000
> sim.base.fun<-function(iter){  #replaces for(iter in 1:simus){
> 
> #bunch of code setting up random draws from various 
> #distributions for X matrix 
>  
> #Computaion of Y
>  
> (fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=2))
>  
> estbeta<-fixef(fitmodel)
>  sdebeta<-sqrt(diag(vcov(fitmodel)))
>  for(l in 1:betasize)
>  {  
>    cibeta<-estbeta[l]-sgnbeta[l]*z1score*sdebeta[l]
>    if(beta[l]*cibeta>0)    
>    return(powaprox[[1]]<-powaprox[[1]]+1 #replaces powaprox[[l]]<-powaprox[[l]]+1
>    retunr(sedpower[1,iter]<-as.numeric(sdebeta[1]) #replaces sdepower[l,iter]<-as.numeric(sdebeta[l])
>      }
>                                      }#end of sim.base.fun
> sim.fun<_lapply(iter in 1:simus),sim.base.fun)
>  

No, the above won't work (all typos aside) since you are still using assignments. Inherently the above is not parallelizable as-is because of
powaprox[[l]]<-powaprox[[l]]+1
but that is trivially removed. Also the inner for loop is entirely superfluous.

Since your code was incomplete this is just a suggestion as we can't test anything (I'm not even checking if what you're doing makes any sense), but it should give you an idea of what to do:

sim.base <- function(iter) { 
  ## your unstated code goes here ...
  fitmodel <- lmer(modelformula,data,family=binomial(link=logit),nAGQ=2)
  estbeta <- fixef(fitmodel)
  sdebeta <- sqrt(diag(vcov(fitmodel)))
  list( powf = estbeta-sgnbeta*z1score*sdebeta*beta > 0,
        sdepower = sdebeta)
  )
}

res <- lapply(seq.int(simus), sim.base)
powf <- sapply(res, function(x) x$powf)
sdepower <- sapply(res, function(x) x$sdepower)
powapprox <- apply(powf, 1, sum)

To parallelize, replace lapply above with mclapply. 

>  Do I need a collect statement? 

No, if you use mclapply it's all done for you.

> How do I kill the worker processes? 

No, they go away once they're done computing.

> One thing I'm not accounting for is the random
> number generating process.  I've seen a couple of ways of
> doing this, one using mclapply and another using rlecuyer,
> but I'm not sure how to make them work.  Suggestions?
> Is there a way to replicate the same random numbers used 
> in the conventional loop in the parallel loop?

There is but you may need read up on that - you have to use a generator that produces a streaming sequence such that you can skip forward (also see recent discussion here and R-devel). For the above you may get away without it by simply setting different seeds in each iteration.

>  My machine has 4-Intel® Xeon® Processor X5650 CPUs.
> When I type:
>  
> library(multicore)
> multicore:::detectCores()
>  
> it returns 4.  However, the Intel cutsheet says each of these proceesors
> has 6 cores.  How can I acces them? 

You can set any number of cores you want using mc.cores=... argument - the detected cores are just a default if you don't specify anything.

What OS/distro is this? It is very unusual to see the number of cores reported wrongly by the system ...

Cheers,
Simon

> I know there's a point of diminishing
> returns but I can't figure out what it is unless I test it.
> 
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
> 
>