[R-sig-hpc] Parallel computing with snow
Gang Chen
gangchen6 at gmail.com
Fri Jan 2 22:55:03 CET 2009
I've been using parApply() from the snow package for parallel computing,
with the following lines in R 2.8.1:
library(snow)
nNodes <- 4
cl <- makeCluster(nNodes, type = "SOCK")
fm <- parApply(cl, myData, c(1,2), func1, ...)
Since I have a Mac running OS X (version 10.4.11) with two dual-core
processors, I thought that I could run 4 simultaneous worker processes
(cluster nodes). However, with the 1st job it seems only two workers
(PIDs 362 and 364 below) were running, with roughly the same CPU time
(4th column), while the other two were pretty much idling (I assume the
1st row, PID 357, is the main process from which I started R):
PID  COMMAND    %CPU      TIME  #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
357  R          0.0%   0:15.81    1     20     171   128M  5.66M   137M   169M
362  R         99.8%  11:41.07    1     19     129  28.8M  5.66M  38.3M  64.7M
364  R        100.3%  12:26.43    1     19     129  28.5M  5.66M  38.0M  64.7M
366  R          0.0%   0:01.67    1     19     120  23.7M  4.88M  32.3M  61.2M
368  R          0.0%   0:01.68    1     19     120  23.7M  4.88M  32.3M  61.2M
Why wasn't the CPU time split roughly equally among the 4 workers,
instead of two of them being barely used?
I also tried a different job with fm <- parApply(cl, myData, c(1,2),
func2, ...), and the result is slightly different: all 4 workers were
more or less involved, although the CPU time was still not distributed
evenly either:
PID  COMMAND    %CPU      TIME  #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
413  R          0.0%   0:18.46    1     20     119   221M  4.57M   231M   250M
419  R         93.3%   2:53.62    1     19      80  18.0M  4.57M  29.1M  51.2M
421  R         93.6%   6:07.85    1     19      79  15.9M  4.57M  26.9M  50.2M
423  R         92.8%   5:12.13    1     19      79  17.4M  4.57M  28.4M  50.2M
425  R         93.3%   1:39.73    1     19      82  20.0M  4.57M  32.9M  53.2M
What gives? Why does the CPU usage pattern differ between the two jobs?
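
One way to check whether the imbalance comes from the data rather than from the machine is to time the work block by block in a single R process. The sketch below is a toy illustration, not a diagnosis of the two jobs above. toy_data and toy_func are hypothetical stand-ins for myData and func1, and the contiguous four-way split is only an assumption about how a static scheduler might hand out cells to 4 workers; it is not code taken from snow.

```r
# Toy illustration only: toy_data, toy_func and the chunking rule are
# hypothetical stand-ins, not the poster's myData/func1 or snow internals.
set.seed(1)
n_nodes  <- 4
toy_data <- matrix(rnorm(40 * 40), nrow = 40)
toy_data[, 21:40] <- 0           # second half of the cells is "cheap"

toy_func <- function(x) {
  if (x == 0) return(0)          # cheap path: nothing to compute
  for (i in 1:2000) x <- sin(x)  # expensive path: artificial busy work
  x
}

# Assumed static scheduling: one contiguous block of cells per worker
# (cells are taken in column-major order).
idx    <- seq_along(toy_data)
chunks <- split(idx, ceiling(idx * n_nodes / length(toy_data)))

# Number of expensive (non-zero) cells that land in each block.
expensive_per_chunk <- unname(sapply(chunks, function(k) sum(toy_data[k] != 0)))

# Serial elapsed time per block; no cluster is needed for this check.
chunk_time <- unname(sapply(chunks, function(k) {
  system.time(for (j in k) toy_func(toy_data[j]))[["elapsed"]]
}))

print(expensive_per_chunk)
print(chunk_time)
```

In this toy setup all the expensive cells fall into the first two blocks, so two of the four blocks do nearly all the work. If the same check on the real data and function showed similarly lopsided timings, a load-balanced call such as snow's clusterApplyLB() might be worth trying, though whether that applies here is untested.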
All help is highly appreciated,
Gang