[R-sig-hpc] Matrix multiplication
Claudia Beleites
claudia.beleites at ipht-jena.de
Thu Mar 15 18:41:32 CET 2012
Simon and Paul,
It seems I have trouble with some part of the configuration on the
server: I'm no longer able to change the number of threads for
GotoBLAS; it always stays at 6 (which is fortunately quite a sensible
number).
So, before believing what I wrote yesterday, please try it yourself.
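For reference, the way I normally set the GotoBLAS thread count is via
an environment variable before R starts (a sketch, assuming a GotoBLAS
build that honours GOTO_NUM_THREADS):

## set before launching R, e.g. from the shell:
##   GOTO_NUM_THREADS=6 R
## GotoBLAS reads this when the BLAS library is loaded, so changing it
## from inside a running session usually has no effect.
Sys.getenv("GOTO_NUM_THREADS")  # check what this session was started with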
>> node). However, snow (and multicore) need more RAM
>
> Snow does but not multicore - the benefit of multicore is that all
> data at the point of parallelization is shared and thus it doesn't
> use extra memory (at least on modern OSes that support COW fork). The
> only extra RAM will be whatever is allocated later for the
> computation that is run in parallel.
Yes, you are right: unlike snow, multicore does not need copies of the
same data.
In practice, however, what I parallelize explicitly is often
bootstrap or similar calculations, so I do need more RAM because each
worker uses its own resampled data set, which of course is not shared
(it is allocated only after the fork).
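For illustration, a minimal sketch of the kind of thing I mean (the
data and the statistic are made up):

library(multicore)

x <- rnorm(1e6)                    # hypothetical data set
boot.stat <- function(i) {
  xs <- sample(x, replace = TRUE)  # each worker allocates its own resample
  mean(xs)
}

## 100 bootstrap replicates spread over the forked workers;
## note that plain multicore does not manage the workers' RNG streams.
boot <- mclapply(1:100, boot.stat, mc.cores = 6)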
>> Multicore doesn't make use of the implicit parallelization of the
>> BLAS.
>
> Actually, it does:
I get
> m <- matrix(1:9e6, 3e3)
> system.time(lapply(1:4, function(i) sum(tcrossprod(m^i))))
   user  system elapsed
 13.751   2.570   4.527
and I see 6 cores working.
With multicore:
> multicore:::detectCores ()
[1] 12
First try: mc.cores = 2, as 2 workers x 6 BLAS threads = 12 cores:
> system.time(mclapply(1:4, function(i) sum(tcrossprod(m^i)), mc.cores = 2))
Timing stopped at: 123.457 266.559 195.029
Without mc.cores, in case that screwed something up:
> system.time(mclapply(1:4, function(i) sum(tcrossprod(m^i))))
Timing stopped at: 2569.413 5758.595 2075.161
I see 4 cores working at 100 %.
I do have the problem that I always need to execute
system(sprintf('taskset -p 0xffffffff %d', Sys.getpid()))
at the beginning of the R session. With snow, I execute that on the
nodes as well, but with multicore I don't know how to do that.
So probably the configuration really is messed up...
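One thing I might try (untested; it is the same taskset call as above,
just run per worker): reset the affinity mask inside the function that
mclapply executes, so every forked child fixes it for itself:

reset.affinity <- function()
  system(sprintf("taskset -p 0xffffffff %d", Sys.getpid()))

system.time(mclapply(1:4, function(i) {
  reset.affinity()               # each forked child resets its own mask
  sum(tcrossprod(m^i))
}, mc.cores = 2))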
>    user  system elapsed
>  10.136   0.568   0.664
>
> However, you really want to control the interplay of the explicit and
> implicit parallelization. This is where the parallel package comes
> into play (and why it includes multicore) so that for the explicit +
> R-implicit parallelization (not BLAS, though) we can control the
> maximal load (and RNG).
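For completeness, my understanding of what that looks like with the
parallel package (worker count via mc.cores, reproducible streams via
the L'Ecuyer-CMRG generator); this controls the R side only, not the
BLAS threads:

library(parallel)

RNGkind("L'Ecuyer-CMRG")         # RNG with separate streams per worker
set.seed(42)

res <- mclapply(1:4, function(i) sum(tcrossprod(m^i)),
                mc.cores = 2, mc.set.seed = TRUE)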
> sessionInfo ()
R version 2.14.1 (2011-12-22)
Platform: x86_64-redhat-linux-gnu (64-bit)
locale:
 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C               LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=de_DE.UTF-8    LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] multicore_0.1-7
loaded via a namespace (and not attached):
[1] tools_2.14.1
Best,
Claudia
>
> Cheers, Simon
>
>
>> But it is easier to use than snow: no cluster setup required, no
>> hassle with exporting all variables, etc. So, if the function
>> doesn't have any implicit parallelization anyway, I just change
>> lapply to mclapply, and that's it.
>>
>> Best,
>>
>> Claudia
>>