[R-sig-hpc] How to check if %dopar% really run parallel?
Brian G. Peterson
brian at braverock.com
Tue May 4 13:11:19 CEST 2010
Well, typically you can tell because all your CPUs go to 100% on your system
monitor(s).
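Besides watching the system monitor, one could also ask foreach what it has
registered and see which processes actually ran the tasks. The following is
only a minimal sketch, assuming the same doSNOW backend and two-worker SOCK
cluster as in the quoted code; the names n_workers, master_pid and worker_pids
are illustrative and not part of any package.

```r
library(foreach)
library(doSNOW)

cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# Ask foreach what is currently registered for %dopar%
getDoParRegistered()            # TRUE once a parallel backend is registered
getDoParName()                  # name of the registered backend
n_workers <- getDoParWorkers()  # number of workers foreach will use

# Compare the process ID of this (master) session with the IDs of the
# processes that evaluated the loop body (illustrative variable names)
master_pid  <- Sys.getpid()
worker_pids <- foreach(i = 1:4, .combine = c) %dopar% Sys.getpid()

print(master_pid)
print(unique(worker_pids))

stopCluster(cl)
```

If the PIDs reported by the loop differ from the master's PID, the loop body
was evaluated in the worker processes rather than in the calling session.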
This is what is called a 'toy example'. The vignettes for foreach go over the
fact that there is no free lunch, and for parallelism to make sense, you need
to minimize communication costs between the master and slave nodes. In your
example, the communication cost, and the cost of collecting the data, will be
larger than the cost of doing the calculation, making parallelism actually
detrimental. A recent discussion on 'chunking' on this list should provide
pointers on how to minimize communication overhead when your application
involves a large number of very small calculations.
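As a rough illustration of the chunking idea (not the code from that
discussion), the sketch below sends one block of task indices to each worker
instead of one task at a time, so that each worker returns a single result
vector. It assumes the doSNOW setup from the quoted code; n_tasks, n_chunks and
chunks are made-up names, and sqrt(i) merely stands in for a small calculation.
Note that it calls its work function explicitly: in the quoted code, fun
appears without parentheses, so each iteration only returns the function object
rather than running the loop inside it.

```r
library(foreach)
library(doSNOW)

cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

n_tasks  <- 10000              # many very small calculations (illustrative)
n_chunks <- getDoParWorkers()  # one chunk per worker

# Split the task indices into contiguous chunks of near-equal size
chunks <- split(seq_len(n_tasks),
                sort(rep_len(seq_len(n_chunks), n_tasks)))

# Each %dopar% task now handles a whole chunk, so only n_chunks results
# travel back to the master instead of n_tasks separate ones
res <- foreach(idx = chunks, .combine = c) %dopar% {
  vapply(idx, function(i) sqrt(i), numeric(1))
}

stopCluster(cl)
```

Whether this beats the sequential version still depends on how much work each
chunk contains relative to the cost of shipping it to the worker and back.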
While I must assume that your actual problem is more complex than replicating
the same number one million times, I suggest you (re)read the vignettes that
come with foreach. I also suggest that when you next post, you tell the list a
little more about the actual application you wish to parallelize, as this
should allow the list to give you better guidance.
Regards,
- Brian
Mario Valle wrote:
> Is there any way to check that %dopar% really runs parallel?
> The following code (on a dual core laptop running windows+R 2.11.0pat
> and on Linux+R2.11.0) runs %dopar% more slowly than the same %do% code.
> BTW, if you see any obvious mistake in the code...
> Thanks!
> mario
>
>
> library(doSNOW)
> library(foreach)
>
> fun <- function() for(q in 1:1000000) sqrt(3)
>
> system.time(times(10000) %do% fun, gcFirst = TRUE)
> # user system elapsed
> # 5.74 0.01 6.24
>
> cl <- makeCluster(2, type = "SOCK")
> registerDoSNOW(cl)
>
> system.time(times(10000) %dopar% fun, gcFirst = TRUE)
> # user system elapsed
> # 7.89 0.19 9.01
>
> stopCluster(cl)
>
--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock