[R-sig-hpc] Parallel computing with snow

Gang Chen gangchen6 at gmail.com
Fri Jan 2 22:55:03 CET 2009


I've been using parApply() from the snow package for parallel
computing, with the following lines in R 2.8.1:

   library(snow)
   nNodes <- 4
   cl <- makeCluster(nNodes, type = "SOCK")
   fm <- parApply(cl, myData, c(1,2), func1, ...)

Since my Mac (OS X 10.4.11) has two dual-core processors, I expected
all 4 cluster nodes to run simultaneously. However, with the first job
only two of them (PIDs 362 and 364 below) were running, with roughly
the same CPU time (4th column), while the other two were pretty much
idle (I assume the first row, PID 357, is the main process from which
I started R):

  PID  COMMAND    %CPU      TIME  #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
  357  R          0.0%   0:15.81    1     20     171   128M  5.66M   137M   169M
  362  R         99.8%  11:41.07    1     19     129  28.8M  5.66M  38.3M  64.7M
  364  R        100.3%  12:26.43    1     19     129  28.5M  5.66M  38.0M  64.7M
  366  R          0.0%   0:01.67    1     19     120  23.7M  4.88M  32.3M  61.2M
  368  R          0.0%   0:01.68    1     19     120  23.7M  4.88M  32.3M  61.2M

Why wasn't the CPU time split roughly equally among the 4 nodes,
instead of two of them being barely used?
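
One way to probe this without a cluster is sketched below. It assumes (this is an assumption about snow's scheduling, not something established above) that parApply() hands each node one contiguous, equal-sized block of cells up front. The matrix and the "nonzero cells are the expensive ones" cost model are hypothetical stand-ins for myData and func1.

```r
# Hypothetical stand-in for myData: only the first 20 columns are nonzero.
set.seed(1)
myData <- matrix(0, nrow = 40, ncol = 40)
myData[, 1:20] <- rnorm(40 * 20)

nNodes <- 4
cells  <- seq_along(myData)            # cells in column-major order

# Assumed scheme: contiguous, equal-sized blocks, one per node.
chunkId <- ceiling(cells * nNodes / length(cells))

# Hypothetical cost proxy: count the nonzero cells in each block.
work <- tapply(as.vector(myData != 0), chunkId, sum)
print(work)
# Blocks 1 and 2 hold all 800 nonzero cells; blocks 3 and 4 hold none.
```

If the real data has a similar layout (for example, large regions where func1 returns almost immediately), equal-sized blocks would not mean equal work, which would match two busy and two idle processes.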

I also tried a different job, fm <- parApply(cl, myData, c(1,2),
func2, ...), and the result was slightly different: all 4 nodes were
more or less involved, although the work was still not distributed
evenly either:

  PID  COMMAND    %CPU      TIME  #TH  #PRTS  #MREGS  RPRVT  RSHRD  RSIZE  VSIZE
  413  R          0.0%   0:18.46    1     20     119   221M  4.57M   231M   250M
  419  R         93.3%   2:53.62    1     19      80  18.0M  4.57M  29.1M  51.2M
  421  R         93.6%   6:07.85    1     19      79  15.9M  4.57M  26.9M  50.2M
  423  R         92.8%   5:12.13    1     19      79  17.4M  4.57M  28.4M  50.2M
  425  R         93.3%   1:39.73    1     19      82  20.0M  4.57M  32.9M  53.2M

What gives? Why does the node usage differ between the two jobs?
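
For comparison, here is a minimal sketch of a load-balanced variant using snow's clusterApplyLB(), which gives each node a new task as soon as it finishes the previous one. The data, the func2 body, and the task size of 10 cells are all hypothetical placeholders, and shipping the whole matrix with every task is done only to keep the example short.

```r
library(snow)

cl <- makeCluster(4, type = "SOCK")

# Hypothetical stand-ins for myData and func2.
myData <- matrix(rnorm(200), nrow = 10, ncol = 20)
func2  <- function(x) x^2

# Cut the cells into many small tasks (10 cells each, far more tasks than nodes).
cells <- seq_along(myData)
tasks <- split(cells, ceiling(cells / 10))

# Each task applies f to its own cells; results come back in task order.
res <- clusterApplyLB(cl, tasks,
                      function(idx, dat, f) sapply(dat[idx], f),
                      dat = myData, f = func2)

# Reassemble the per-cell results into the original shape.
fm <- array(unlist(res), dim = dim(myData))

stopCluster(cl)
```

With many small tasks, a node that draws cheap cells simply picks up more tasks, so the CPU times should end up closer together, at the price of more communication with the master process.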

All help is highly appreciated,
Gang
