[R-sig-hpc] Matrix multiplication
pgilbert902 at gmail.com
Tue Mar 13 20:05:50 CET 2012
On 12-03-13 12:50 PM, Brian G. Peterson wrote:
> On Tue, 2012-03-13 at 12:40 -0400, Paul Gilbert wrote:
>> Thanks for spelling this out for those of us that are a bit slow.
>> (Newbie questions below)
> <... snip ...>
>>> So, if your BLAS does multithreaded matrix multiplication, it will use
>>> multiple threads 'implicitly', as Simon pointed out.
>> Is there an easy way to know if the R I am using has been compiled with
>> multi-thread BLAS support?
> BLAS should be 'plug and play', as R is usually compiled to use a shared
> object BLAS. As such, installing the BLAS on your machine (and
> appropriately configuring it) should 'just work' with the new BLAS when
> you restart R.
> Dirk et al. wrote a paper, now a bit dated, that benchmarked some of
> the BLAS libraries, that should have some additional details.
(I have a long history of getting things that should 'just work' to
'just not work'.) But I didn't really state my question very well. I'm
really wondering about two related situations. First, how can I confirm,
after a change to the underlying system, that R is using the new
configuration? Second, if I am running benchmarks in R, is there an easy
way to record the underlying configuration that is being used?
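One way to record the configuration next to benchmark results is sketched below. It is only a sketch under some assumptions: the helper name blas_config and its list fields are made up for illustration, La_version() and the BLAS/LAPACK entries of sessionInfo() are only available in more recent R releases, and the environment variable names are ones that some BLAS builds read, not something R itself defines.

```r
# Hypothetical helper: collect what R can report about its linear-algebra setup.
blas_config <- function() {
  info <- sessionInfo()
  list(
    r_version = R.version.string,
    platform  = info$platform,
    # Newer R releases report the BLAS/LAPACK shared objects in sessionInfo();
    # fall back to NA where these fields are absent.
    blas      = if (is.null(info$BLAS)) NA_character_ else info$BLAS,
    lapack    = if (is.null(info$LAPACK)) NA_character_ else info$LAPACK,
    lapack_version = La_version(),
    # Thread-count variables read by some BLAS builds; NA means "not set".
    thread_env = Sys.getenv(c("OMP_NUM_THREADS", "GOTO_NUM_THREADS",
                              "OPENBLAS_NUM_THREADS"), unset = NA)
  )
}

cfg <- blas_config()
str(cfg)

# Crude runtime check: time a matrix product and compare CPU time with
# wall-clock time.
n <- 500
a <- matrix(rnorm(n * n), n, n)
timing <- system.time(a %*% a)
print(timing)
```

If the reported user time is clearly larger than the elapsed time for a big enough matrix, more than one thread was probably busy. That is a heuristic, not proof, so saving the output of str(cfg) with each benchmark run is the more reliable record.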
>>> Be aware that there can be unintended negative interactions between
>>> implicit and explicit parallelization. On cluster nodes I tend to
>>> configure the BLAS to use only one thread to avoid resource contention
>>> when all cores are doing explicit parallelization.
>> How do you do this? Does it need to be done when you are compiling R, or
>> can it be done on the fly while running R processes?
> Some BLAS, like gotoblas, support an environment variable to change the
> number of cores to be used. This can be changed at run-time. Others,
> like the mkl, are always multithreaded. Others, like ATLAS, can be
> compiled in either single threaded or multi-threaded modes.
> So, for me, on my cluster nodes, I use a single threaded BLAS, assuming
> that *explicit* parallelization will be the primary driver of CPU load,
> and not wanting to over-commit the processor when 12 calculations each
> try to spawn 12 threads in the BLAS. On other machines, I might use a
> multithreaded BLAS like gotoblas so that I have some flexibility (though
> apparently unlike Claudia, I rarely change it in practice).
> - Brian
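For a BLAS that reads an environment variable, the run-time setting described above could be tried along these lines. This is a sketch with assumptions: whether any of these variables is honoured depends on the BLAS actually in use, the variable names are the ones commonly associated with OpenMP-based builds and GotoBLAS/OpenBLAS, and the values are set for a fresh child process because a BLAS that is already loaded may not re-read them.

```r
# Assumed variable names; setting one that the BLAS ignores is harmless.
single_thread_env <- c("OMP_NUM_THREADS=1", "GOTO_NUM_THREADS=1",
                       "OPENBLAS_NUM_THREADS=1")

# Start a child R process with those variables and have it echo one back,
# to confirm that the setting reached the new process.
rscript <- file.path(R.home("bin"), "Rscript")
out <- system2(rscript,
               args = c("-e", shQuote('cat(Sys.getenv("OMP_NUM_THREADS"))')),
               env = single_thread_env, stdout = TRUE)
print(out)
```

The same variables can be exported in the shell or job script before R starts, which is likely simpler on cluster nodes where every worker should get a single-threaded BLAS.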