[R-pkg-devel] multithreading in packages

Ivan Krylov kry|ov@r00t @end|ng |rom gm@||@com
Sat Oct 9 08:52:10 CEST 2021


On Thu, 7 Oct 2021 21:58:08 -0400 (EDT)
Vladimir Dergachev <volodya using mindspring.com> wrote:

>    * My understanding from reading documentation and source code is
> that there is no dedicated support in R yet, but there are packages
> that use multithreading. Are there any plans for multithreading
> support in future R versions ?

Shared memory multithreading is hard to get right in a memory-safe
language (e.g. R), but there's the parallel package, which is a part of
base R. It offers process-based parallelism and can run your code on
multiple machines at the same time. There's no communication _between_
these machines, though. (But I think there's an MPI package on CRAN.)

>    * pthread or openmp ? I am particularly concerned about
> interaction with other packages. I have seen that using pthread and
> openmp libraries simultaneously can result in incorrectly pinned
> threads.

pthreads-based code could be harder to run on Windows (which is a
first-class platform for R, expected to be supported by most packages).
OpenMP should be cross-platform, but Apple compilers are sometimes
lacking; the latest problem has likely been solved since I heard about
it. If your problem can be made embarrassingly parallel, you're welcome
to use the parallel package.
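
To illustrate the portability point, here is a minimal sketch of how
OpenMP code can be written so that it still builds and gives the same
answer with a compiler that lacks OpenMP support. The function name
sum_squares() is made up for this example. The pragma is guarded by
the standard _OPENMP macro, so without OpenMP the loop simply runs
serially.

```c
#include <stddef.h>

/* Hypothetical example function (not from any package):
 * returns the sum of squares of x[0..n-1]. */
double sum_squares(const double *x, size_t n)
{
    double total = 0.0;

    /* _OPENMP is defined only when the compiler was asked to enable
     * OpenMP. Without it the pragma is skipped and the loop below is
     * an ordinary serial loop. */
#ifdef _OPENMP
    #pragma omp parallel for reduction(+:total)
#endif
    for (long i = 0; i < (long)n; i++)
        total += x[i] * x[i];

    return total;
}
```

In a package, I believe you would also add $(SHLIB_OPENMP_CFLAGS) to
both PKG_CFLAGS and PKG_LIBS in src/Makevars, so that R supplies the
right flags on platforms that have OpenMP and nothing on those that
don't.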

>    * control of maximum number of threads. One can default to openmp 
> environment variable, but these might vary between openmp
> implementations.

Moreover, CRAN-facing tests aren't allowed to consume more than 200%
CPU (i.e. two cores), so it's a good idea to leave the number of
workers under the user's control. According to a reference guide I got
from openmp.org, OpenMP implementations are expected to understand
omp_set_num_threads() and the OMP_NUM_THREADS environment variable.
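
As a rough sketch of what leaving the thread count to the user could
look like on the C side: the two helpers below are hypothetical names
of my own, and only omp_set_num_threads() and omp_get_max_threads()
are actual OpenMP calls. The idea is that the package takes the count
as an argument from the user instead of picking one itself, and falls
back to a single thread when OpenMP isn't available.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: apply a user-requested thread count.
 * Values below 1 are treated as 1. Returns the count that will
 * actually be requested from OpenMP, or 1 if OpenMP is unavailable. */
int set_worker_threads(int requested)
{
    if (requested < 1)
        requested = 1;
#ifdef _OPENMP
    omp_set_num_threads(requested);
    return requested;
#else
    return 1;
#endif
}

/* Hypothetical helper: the number of threads the next parallel
 * region may use. Before any omp_set_num_threads() call this
 * reflects OMP_NUM_THREADS if the user has set it. */
int default_worker_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

For CRAN checks, the examples and tests would then pass at most 2 to
set_worker_threads(), while users remain free to ask for more.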

-- 
Best regards,
Ivan


