[R-sig-hpc] Question on foreach package

Mon Jul 18 21:48:25 CEST 2011

On Mon, 2011-07-18 at 12:25 -0700, Megh Dal wrote:
> As per the documentation of the foreach package, if I use "%dopar%" the computation happens in parallel, whereas with "%do%" it happens sequentially. Here, I tried both "%dopar%" and "%do%" on one of the examples given in the help page ?foreach:
> 
> > a <- matrix(1:1600, 40, 40)
> > b <- t(a)
> > system.time(foreach(b=iter(b, by='col'), .combine=cbind) %dopar%   (a %*% b))
>    user  system elapsed 
>    0.04    0.00    0.05 
> > a <- matrix(1:1600, 40, 40)
> > b <- t(a)
> > system.time(foreach(b=iter(b, by='col'), .combine=cbind) %do%   (a %*% b))
>    user  system elapsed 
>    0.05    0.00    0.05 
> 
> Surprisingly, however, I did not see any improvement in the computation time. I am using Windows Vista with a dual-core CPU (I think it is dual core because when I open Task Manager -> Performance, I see two windows under CPU Usage History. I am correct that it is dual core, right?). Since it is dual core, shouldn't the computation time with "%dopar%" be half that of "%do%"?
> 
> Am I missing something?
> 
> Your help will be highly appreciated.

It doesn't look like you registered a parallel backend for foreach (also
per the documentation).

See, for example, doMC, doSMP, doRedis, and doMPI as parallel backends for
foreach. You need to call a registerDo* function, or %dopar% will
run sequentially, just like %do%. I use %dopar% in all my programming, test in
single-threaded mode, and then register a parallel backend so that I get
parallel execution 'for free'.
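
As a minimal sketch of what registering a backend looks like, here is your
example again with a two-worker cluster. Note the assumptions: it uses the
doParallel package, which is not one of the backends named above, and the
loop variable is renamed to bcol (a name chosen only for this illustration)
so that it does not shadow the matrix b.

```r
library(foreach)
library(iterators)   # provides iter()
library(doParallel)  # assumption: this backend is not named in the thread

# Start two worker processes (one per core on a dual-core machine)
# and register them as the backend that %dopar% will use.
cl <- makeCluster(2)
registerDoParallel(cl)
workers <- getDoParWorkers()  # number of workers foreach will use

a <- matrix(1:1600, 40, 40)
b <- t(a)

# Each task multiplies a by one column of b; cbind reassembles the product.
res <- foreach(bcol = iter(b, by = 'col'), .combine = cbind) %dopar% (a %*% bcol)

stopCluster(cl)  # release the worker processes
```

Checking getDoParWorkers() after registration is a quick way to confirm
that a backend is actually in place before timing anything.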

Your example is a trivial 'toy' one too, so communication costs may take
as long as the calculation, but we'll leave that aside for now.
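
For what it's worth, the point about communication costs can be checked by
giving each task enough work to outweigh the cost of shipping it to a
worker. The following is only a sketch under assumptions: it again uses
doParallel (not named above), and the workload is a made-up one chosen to be
slow. Timings will vary by machine, so none are shown.

```r
library(foreach)
library(doParallel)  # assumption: this backend is not named in the thread

cl <- makeCluster(2)
registerDoParallel(cl)

# Hypothetical workload: each of the 8 tasks sums several million square
# roots, so computation rather than communication dominates each task.
t_seq <- system.time(
  r_seq <- foreach(i = 1:8, .combine = c) %do% sum(sqrt(seq_len(5e6) + i))
)
t_par <- system.time(
  r_par <- foreach(i = 1:8, .combine = c) %dopar% sum(sqrt(seq_len(5e6) + i))
)

stopCluster(cl)

# Compare the 'elapsed' entries; 'user' time for %dopar% excludes the workers.
print(t_seq)
print(t_par)
```

With only 40 small matrix-vector products, as in the original example, the
per-task overhead is likely to swamp any gain, which is why the toy example
is a poor benchmark.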

-- 
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock