[R-sig-hpc] foreach + doMC not fully parallel?
Brian D Peyser PhD
bpeyser at jhmi.edu
Fri Aug 27 00:14:12 CEST 2010
Hi everyone,
I am trying to run code in parallel across 4 cores with foreach and
doMC. The code runs fine and spawns 4 R processes, but these processes
don't each get their own core. Often I see one core maxed out, with
each R process taking about 24% of the CPU. I also see loads like
96%, 98%, 48%, 48% (3 of 4 cores in use) or 32%, 32%, 96%, 32%
(2 cores). The distribution changes during a run, and I almost never
see all 4 cores maxed.
I am running like this (pseudo-code):
> library("doMC")
> registerDoMC(cores = 4)  # I have a 4-core processor
> # Read in data....
> # Strip the ".txt" extension to get the object names
> objs <- sub("\\.txt$", "", files)
> foreach(obj = objs, .inorder = FALSE, .errorhandling = "pass",
+         .options.multicore = list(preschedule = FALSE)) %dopar% {
+   # Do lots of computation with get(obj)
+   # Output a PNG plot and some text
+ }
I have also tried preschedule=TRUE, which shows the same problem,
except that a process or two finishes early because it had a core to
itself instead of sharing. This is on Linux (Ubuntu 10.04 64-bit
desktop edition), and I am running R from Emacs with ESS (maybe I should
try from a terminal instead; I hadn't thought to do that). Any thoughts
that could help?
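One check that might help narrow this down is a small CPU-bound test that is independent of my real code. The sketch below is only an illustration under stated assumptions: it uses mclapply() from the parallel package bundled with current versions of R, not the doMC/multicore setup in my session, and busy_work() is a hypothetical stand-in for the real computation. It times the same tasks serially and with 4 forked workers.

```r
library(parallel)  # assumption: bundled with current R; not the doMC setup above

# Hypothetical stand-in for the real per-file computation (pure CPU work).
busy_work <- function(i) {
  x <- 0
  for (k in seq_len(2e6)) x <- x + sqrt(k + i)
  x
}

tasks <- seq_len(8)

# Run all tasks in a single process.
serial_time <- system.time(
  ser_out <- lapply(tasks, busy_work)
)[["elapsed"]]

# Run the same tasks with 4 forked workers, one fork per task
# (mc.preschedule = FALSE), mirroring preschedule = FALSE above.
parallel_time <- system.time(
  par_out <- mclapply(tasks, busy_work,
                      mc.cores = 4, mc.preschedule = FALSE)
)[["elapsed"]]

speedup <- serial_time / parallel_time
cat(sprintf("serial: %.2fs  parallel: %.2fs  speedup: %.2f\n",
            serial_time, parallel_time, speedup))
```

A speedup well below the number of cores on an otherwise idle machine would suggest the workers are sharing cores, while a speedup close to 4 would point back at something specific to the real computation.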
Thanks,
Brian
(Here's my session info):
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-linux-gnu
locale:
[1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
[3] LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8
[5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
[7] LC_PAPER=en_US.utf8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] doMC_1.2.1 foreach_1.3.0 codetools_0.2-2 iterators_1.0.3
[5] multicore_0.1-3