[R-sig-hpc] Matrix multiplication

Claudia Beleites claudia.beleites at ipht-jena.de
Thu Mar 15 18:41:32 CET 2012

Simon and Paul,

It seems I have trouble with some part of the configuration on the server:
I am no longer able to change the number of threads for
GotoBLAS; it always stays at 6 (which is fortunately a quite sensible
So, before believing what I wrote yesterday, please try yourself.

>> node). However, snow (and multicore) need more RAM
> Snow does but not multicore - the benefit of multicore is that all
> data at the point of parallelization is shared and thus it doesn't
> use extra memory (at least on modern OSes that support COW fork). The
> only extra RAM will be whatever is allocated later for the
> computation that is run in parallel.
Yes, you are right: unlike snow, multicore does not need copies of the
same data.
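
A minimal sketch of that difference, using the parallel package mentioned
further down (it bundles both the multicore and the snow interface). The
matrix x, its size and the worker count of 2 are made up for illustration:

```r
library(parallel)

x <- matrix(rnorm(1e4), 100)   # small demo data (hypothetical)

# Forked workers: the children see x via copy-on-write, nothing is exported.
# mclapply cannot fork on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
res_fork <- mclapply(1:4, function(i) sum(x^i), mc.cores = n_cores)

# Socket cluster (snow style): each worker is a fresh R process,
# so x has to be copied to every worker explicitly.
cl <- makeCluster(2)
clusterExport(cl, "x", envir = environment())
res_sock <- parLapply(cl, 1:4, function(i) sum(x^i))
stopCluster(cl)
```

Both calls should give the same sums; only the socket version holds one
extra copy of x per worker.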

However, in practice, what I parallelize explicitly is often
bootstrap or similar calculations, so I do need more RAM because each
process uses its own resampled data set. Which of course is not

>> Multicore doesn't make use of the implicit parallelization of the
>> BLAS.
> Actually, it does:
I get

> m <-matrix (1:9e6, 3e3)
> system.time(lapply(1:4, function(i) sum(tcrossprod(m^i))))
       user      system     elapsed
     13.751       2.570       4.527
and see 6 cores working.

with multicore:
> multicore:::detectCores ()
[1] 12

First try: mc.cores = 2, as 2 x 6 = 12:
> system.time(mclapply(1:4, function(i) sum(tcrossprod(m^i)), mc.cores = 2))

Timing stopped at: 123.457 266.559 195.029

without mc.cores, in case that screwed up something:
> system.time(mclapply(1:4, function(i) sum(tcrossprod(m^i))))

Timing stopped at: 2569.413 5758.595 2075.161
I see 4 cores working at 100 %
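
The reasoning behind mc.cores = 2 can be written down as a small sketch:
the number of forked workers times the number of BLAS threads per worker
should not exceed the number of cores. The value 6 for the BLAS threads is
taken from this post and is an assumption for any other machine; this only
shows the arithmetic and does not by itself fix the timings above.

```r
library(parallel)

blas_threads <- 6L                  # threads GotoBLAS uses here (assumed fixed)
total_cores  <- detectCores()       # 12 on the server described in this post
if (is.na(total_cores)) total_cores <- 1L

# Largest worker count such that workers * blas_threads <= total_cores,
# but never fewer than one worker.
workers <- max(1L, total_cores %/% blas_threads)
```

The result would then be passed on as mc.cores = workers.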

I do have the problem that I always need to execute
system(sprintf('taskset -p 0xffffffff %d', Sys.getpid()))
at the beginning of the R session. With snow, I execute that on the
nodes as well, but with multicore I don't know how to do that.
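
One untested idea, as a sketch only: issue the same taskset call from inside
the function handed to mclapply, so that it runs in each forked child. The
helper reset_affinity is hypothetical, and whether this cures the problem on
this server is an assumption.

```r
library(parallel)

# Hypothetical helper: widen the CPU affinity mask of the calling process.
reset_affinity <- function(mask = "0xffffffff") {
  if (!nzchar(Sys.which("taskset"))) return(invisible(FALSE))  # taskset missing
  system(sprintf("taskset -p %s %d", mask, Sys.getpid()),
         ignore.stdout = TRUE)
  invisible(TRUE)
}

n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
res <- mclapply(1:4, function(i) {
  reset_affinity()   # runs in the forked child, so Sys.getpid() is its PID
  i^2
}, mc.cores = n_cores)
```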

So probably the configuration is really messed up...

>    user  system elapsed
>  10.136   0.568   0.664
> However, you really want to control the interplay of the explicit and
> implicit parallelization. This is where the parallel package comes
> into play (and why it includes multicore) so that for the explicit +
> R-implicit parallelization (not BLAS, though) we can control the
> maximal load (and RNG).
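
For the RNG part, a minimal sketch of a reproducible bootstrap with the
parallel package, assuming the "L'Ecuyer-CMRG" generator and
mc.reset.stream() as described in its documentation. The data vector, the
seed and the helper boot_means are made up for illustration:

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")   # generator that provides separate streams per child
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
x <- c(2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7)   # made-up data

# Hypothetical helper: four bootstrap means, computed in forked children.
boot_means <- function(seed) {
  set.seed(seed)
  mc.reset.stream()        # restart the child streams from the seed just set
  unlist(mclapply(1:4, function(i) mean(sample(x, replace = TRUE)),
                  mc.cores = n_cores, mc.set.seed = TRUE))
}

run1 <- boot_means(42)
run2 <- boot_means(42)
```

With the same seed and the same number of cores, run1 and run2 should be
identical.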

> sessionInfo ()
R version 2.14.1 (2011-12-22)
Platform: x86_64-redhat-linux-gnu (64-bit)

 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] multicore_0.1-7

loaded via a namespace (and not attached):
[1] tools_2.14.1


> Cheers, Simon
>> But it is easier to use than snow: no cluster set up required, no
>> hassle with exporting all variables, etc. So, if the function
>> anyways doesn't have any implicit parallelization, I just change
>> lapply to mclapply, and that's it.
>> Best,
>> Claudia
>> _______________________________________________ R-sig-hpc mailing
>> list R-sig-hpc at r-project.org 
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
