[R-sig-hpc] foreach + doMC not fully parallel?

Brian D Peyser PhD bpeyser at jhmi.edu
Fri Aug 27 00:14:12 CEST 2010


Hi everyone,

I am trying to run code in parallel across 4 cores with foreach and
doMC. The code runs fine and spawns 4 R processes, but these processes
don't each get their own core. Often I see a single core maxed out, with
each R process taking about 24% of the CPU. At other times I see
something like 96%, 98%, 48%, 48% (3 of 4 cores in use) or 32%, 32%,
96%, 32% (2 cores). The pattern may change during the run, and I almost
never see all 4 cores maxed.
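
One way to put numbers on the symptom described above is to have each task report its process ID and how much CPU time it received relative to wall-clock time. This is only a minimal sketch under stated assumptions: it uses mclapply() from the base "parallel" package (the successor to multicore in R 2.14 and later, so not available in R 2.11.1), and busy_task is a hypothetical CPU-bound stand-in for the real per-file work.

```r
library(parallel)  # base R >= 2.14; provides the fork-based mclapply()

# Hypothetical CPU-bound stand-in for the real per-file computation.
busy_task <- function(i) {
  start <- proc.time()
  x <- 0
  for (k in seq_len(2e6)) x <- x + sqrt(k)
  used <- proc.time() - start
  data.frame(task    = i,
             pid     = Sys.getpid(),
             cpu     = used[["user.self"]] + used[["sys.self"]],
             elapsed = used[["elapsed"]])
}

# Prescheduling is the mclapply() default: 4 forked workers share the 8 tasks.
res <- do.call(rbind, mclapply(1:8, busy_task, mc.cores = 4))

# cpu / elapsed near 1 means the task had a core to itself; values well
# below 1 mean it spent time waiting (sharing a core, or blocked on I/O).
res$cpu_share <- round(res$cpu / res$elapsed, 2)
print(res)
```

If the pid column shows four distinct workers but cpu_share stays well below 1 for most tasks, the processes exist but are not each getting a full core, which matches what the CPU monitor shows.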

I am running like this (pseudo-code):

> library("doMC")
> registerDoMC(cores = 4)  # I have a 4-core processor
> 	# Read in data....
> # Strip the ".txt" extension; escape the dot and anchor it at the end
> foreach(obj = sub("\\.txt$", "", files),
+         .inorder = FALSE,
+         .errorhandling = "pass",
+         .options.multicore = list(preschedule = FALSE)) %dopar% {
+ 	# Do lots of computation with get(obj)
+ 	# Output a PNG plot and some text
+ }

I have also tried preschedule=TRUE, which has the same problem, though
then a process or two will finish early because it had a core to itself
instead of sharing one. This is on Linux (Ubuntu 10.04 64-bit desktop
edition), and I am running R from Emacs with ESS (maybe I should try
from the terminal instead; I hadn't thought to do that). Any thoughts
that could help?
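
The difference between the two preschedule settings can be illustrated with a small sketch. It rests on assumptions: it calls mclapply() from the base "parallel" package directly (its mc.preschedule argument is what the preschedule option corresponds to), the task durations are made up, and Sys.sleep() stands in for real work, so it shows only how tasks are handed out, not how cores are loaded.

```r
library(parallel)  # base R >= 2.14; provides the fork-based mclapply()

# Hypothetical uneven workload: durations in seconds are invented.
durations <- c(0.4, 0.1, 0.1, 0.1, 0.4, 0.1, 0.1, 0.1)
sleepy_task <- function(d) { Sys.sleep(d); Sys.getpid() }

# Run all tasks on 4 workers and record worker PIDs and wall-clock time.
run <- function(preschedule) {
  elapsed <- system.time(
    pids <- unlist(mclapply(durations, sleepy_task,
                            mc.cores = 4, mc.preschedule = preschedule))
  )[["elapsed"]]
  list(pids = pids, elapsed = elapsed)
}

pre <- run(TRUE)   # tasks divided among 4 workers up front
dyn <- run(FALSE)  # a new process is forked for each task

table(pre$pids)           # number of tasks handled by each prescheduled worker
length(unique(dyn$pids))  # number of distinct processes without prescheduling
c(prescheduled = pre$elapsed, dynamic = dyn$elapsed)
```

With prescheduling, the two long tasks here land on the same worker, so the other three workers finish early and sit idle, which is consistent with a process or two ending early. Without prescheduling, each task gets its own short-lived process and a new one starts as soon as a slot frees up.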

Thanks,

Brian

(Here's my session info):

> sessionInfo()
R version 2.11.1 (2010-05-31) 
x86_64-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C             
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8    
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8   
 [7] LC_PAPER=en_US.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] doMC_1.2.1      foreach_1.3.0   codetools_0.2-2 iterators_1.0.3
[5] multicore_0.1-3


