[R] parallel computing with 'foreach'

Lui ## lui.r.project at googlemail.com
Thu Jul 7 10:27:15 CEST 2011


Hello Stacey,

I do not know whether my answer comes too late; I only just came across
your post. I had a similar problem...

First: think about whether parallelizing is worth it at all. Unless a
single coxph call takes several minutes, parallelizing it is probably of
no great help, because running it in parallel carries overhead of its
own. All workers need to be "taught" about the environment (the
functions and variables they need to know), and some coordination work
is necessary as well. So if each loop iteration takes a long time, you
may want to use foreach; otherwise there is probably no great benefit.
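One rough way to check this is to time a single iteration of the loop body before setting up any workers. The sketch below reuses the test data from the quoted post and assumes only that the survival package is installed; the names one_fit and fit are chosen here for illustration.

```r
library(survival)

# Test data taken from the quoted original post
test1 <- list(time   = c(4, 3, 1, 1, 2, 2, 3),
              status = c(1, 1, 1, 0, 1, 1, 0),
              x      = c(0, 2, 1, 1, 1, 0, 0),
              sex    = c(0, 0, 0, 0, 1, 1, 1))

# Time one iteration of the loop body on its own
one_fit <- system.time(
  fit <- coxph(Surv(time, status) ~ x + strata(sex), test1)
)

# Elapsed wall-clock seconds for a single fit
print(one_fit[["elapsed"]])
```

If that figure is a small fraction of a second, the cost of starting and initializing workers is likely to outweigh any gain.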

What you could do is save only the functions you need in a separate R
file and have the workers load just that file. So you split up your
source code into two parts: one containing the functions you need in
the loop later, and one that controls how the functions work together...

You can try:
## declare a function that loads only the libraries and functions
## necessary inside the loop
mysource <- function(envir, filename = "source.R") source(filename)

## tell the program to have every worker execute that function
smpopts <- list(initEnvir = mysource)

## have it executed with the foreach loop
foreach(....., .options.smp = smpopts) %dopar% {
  ## loop body
}
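A self-contained sketch of the same idea follows. Note the assumptions: it uses the doParallel backend and parallel::clusterCall instead of doSMP and initEnvir, because doSMP may not be available on current systems; the helper file, the function fit_coef and the choice of two workers are made up for illustration. It also uses the .packages argument of foreach, mentioned in the quoted reply, to load survival on the workers.

```r
library(foreach)
library(doParallel)  # assumption: stands in for doSMP

# Hypothetical helper file holding only the functions the loop needs
helper_file <- file.path(tempdir(), "source.R")
writeLines(c(
  "fit_coef <- function(d, i) {",
  "  fit <- coxph(Surv(time, status) ~ x + strata(sex), d)",
  "  summary(fit)$coefficients[i]",
  "}"
), helper_file)

# Test data taken from the quoted original post
test1 <- list(time   = c(4, 3, 1, 1, 2, 2, 3),
              status = c(1, 1, 1, 0, 1, 1, 0),
              x      = c(0, 2, 1, 1, 1, 0, 0),
              sex    = c(0, 0, 0, 0, 1, 1, 1))

cl <- makeCluster(2)
registerDoParallel(cl)

# Every worker sources the helper file once, before the loop starts
invisible(clusterCall(cl, source, helper_file))

# .packages attaches survival on each worker so coxph() can be found
res <- foreach(i = 1:3, .packages = "survival") %dopar% {
  fit_coef(test1, i)
}

stopCluster(cl)
print(res)
```

The controlling script stays small this way, and the workers only ever see the functions kept in the helper file.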


Hope that helps...
Best

Lui


2011/7/1 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
> Type
>  ?foreach
> and read the whole help page - as the posting guide asked you to do before
> posting, you will find the line describing the argument ".packages".
>
> Uwe Ligges
>
>
>
> On 28.06.2011 21:17, Stacey Wood wrote:
>>
>> Hi all,
>> I would like to parallelize some R code and would like to use the 'foreach'
>> package with a foreach loop.  However, whenever I call a function from an
>> enabled package outside of MASS, I get an error message that a number of the
>> functions aren't recognized (even though the functions should be defined).
>> For example:
>>
>> library(foreach)
>> library(doSMP)
>> library(survival)
>> # Create the simplest test data set
>> test1<- list(time=c(4,3,1,1,2,2,3),
>>               status=c(1,1,1,0,1,1,0),
>>               x=c(0,2,1,1,1,0,0),
>>               sex=c(0,0,0,0,1,1,1))
>> # Fit a stratified model
>> coxph(Surv(time, status) ~ x + strata(sex), test1)
>>
>> w<- startWorkers()
>> registerDoSMP(w)
>> foreach(i=1:3) %dopar% {
>> # Fit a stratified model
>> fit<-coxph(Surv(time, status) ~ x + strata(sex), test1)
>> summary(fit)$coef[i]
>> }
>> stopWorkers(w)
>> ####Error message:
>> Error in { : task 1 failed - "could not find function "coxph""
>>
>>
>> If I call library(survival) inside the foreach loop, everything runs
>> properly.  I don't think that I should have to call the package iteratively
>> inside the loop.  I would like to use a foreach loop inside code for my own
>> package, but this is a problem since I can't call my own package in the
>> source code for the package itself!  Any advice would be appreciated.
>>
>> Thanks,
>> Stacey
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>