[Rd] Another issue using multi-processing linear algebra libraries
Rob Steele
rob@tee|e @end|ng |rom y@hoo@com
Tue Aug 6 16:19:25 CEST 2024
From the R Installation and Admin manual:
“There is a tendency for re-distributors of R to use ‘enhanced’ linear algebra libraries without explaining their downsides.”
There’s a downside not mentioned in the manual that caught and baffled me for a while. I was using all 64 cores of an AWS instance via parallel::mclapply() and doing matrix multiplications in the parallelized function. If the matrices were big enough, the linked BLAS or LAPACK would try to use all 64 cores for each multiplication, which meant up to 64^2 = 4096 processes or threads in some combination, and that was the end of all useful work. I worked around the problem by rewriting the matrix multiply as “colSums(x * t(y))”. It also worked to build R from source, which I guess uses the built-in BLAS and LAPACK.
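A minimal sketch of that workaround follows. The shapes are an assumption, since the message does not state them: y is taken to be a matrix and x a vector, in which case the rewrite reproduces the product y %*% x. For two general matrices the same expression would only give the diagonal of y %*% x. The variable names and sizes are illustrative only.

```r
# Assumed shapes (not given in the post): y is a 3 x 4 matrix, x a length-4 vector.
set.seed(1)
y <- matrix(rnorm(12), nrow = 3, ncol = 4)
x <- rnorm(4)

# Ordinary matrix-vector product, normally dispatched to the linked BLAS.
blas_result <- drop(y %*% x)

# The rewrite: x is recycled down the columns of t(y), so column j of
# x * t(y) holds x[i] * y[j, i]; summing each column gives (y %*% x)[j].
workaround <- colSums(x * t(y))

# The two results agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(blas_result, workaround)))
```

Because colSums() and the element-wise product are done in R's own C code rather than in the BLAS, this form should not trigger the library's internal threading.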
Would it make sense to add a parameter somewhere, to mclapply(), say, telling R not to use multiprocessing libraries? Does R even know whether a linked library is doing multi-processing? Does R build its own BLAS and LAPACK if it’s also linking external ones?
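To illustrate what such a parameter might do, here is a rough sketch of a wrapper. The function name mclapply_serial_blas is hypothetical and not part of R or the parallel package. It assumes the CRAN package RhpcBLASctl is available for limiting BLAS threads at run time; if the package is not installed, the wrapper simply falls through to plain mclapply() with no limit applied. Whether the limit actually takes effect depends on which BLAS is linked.

```r
library(parallel)

# Hypothetical helper: run mclapply() with the BLAS limited to one thread.
mclapply_serial_blas <- function(X, FUN, ..., mc.cores = 2L) {
  # RhpcBLASctl is an optional, assumed dependency; skip the limit without it.
  if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    old <- RhpcBLASctl::blas_get_num_procs()
    RhpcBLASctl::blas_set_num_threads(1L)
    # Restore the previous setting when the wrapper returns.
    on.exit(RhpcBLASctl::blas_set_num_threads(old), add = TRUE)
  }
  # Forked workers inherit the setting made above in the parent process.
  mclapply(X, FUN, ..., mc.cores = mc.cores)
}

# Usage: each worker does a small matrix product.
res <- mclapply_serial_blas(1:4, function(i) {
  m <- matrix(i, nrow = 2, ncol = 2)
  m %*% m
}, mc.cores = 2L)
```

This only approximates the idea from user code. It does not answer whether R itself can detect a multi-threaded BLAS.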
Thanks,
Rob