[Rd] How to safely using OpenMP pragma inside a .C() function?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu Sep 1 08:34:12 CEST 2011
Note that currently R internals do not actually use multiple threads
in OpenMP, and there is no documented way to make them do so.
The main issue is that there is insufficient knowledge of where they
are worthwhile (which is both OS and platform-dependent: we don't even
have reliable cross-platform ways to decide a reasonable number of
threads, and the number of virtual cores on a multi-user platform
definitely is not reasonable). Luke Tierney reported that the
crossover point for a speed-up on Mac OS X was much larger matrices
than on Linux, for example, and there is currently no OpenMP support
in the Windows toolchain.
The current implementation is a trial: there are more places planned
to use OpenMP as and when the uncertainties are resolved.
The lack of Windows support will change at some point: given the current instability in
thread support in the MinGW-w64 project this may or may not be before
R 2.14.0.
On Wed, 31 Aug 2011, Simon Urbanek wrote:
> Pawel,
>
> On Aug 31, 2011, at 4:46 PM, pawelm wrote:
>
>> I just found this (performance improvement of the "dist" function when using
>> openmp):
You failed to describe the platform! See the posting guide (which
asked you to do so 'at a minimum').
>> .Internal(setMaxNumMathThreads(1)); .Internal(setNumMathThreads(1));
>> m <- matrix(rnorm(810000),900,900); system.time(d <- dist(m))
>>
>>    user  system elapsed
>>   3.510   0.013   3.524
>>
>> .Internal(setMaxNumMathThreads(5)); .Internal(setNumMathThreads(5));
>> m <- matrix(rnorm(810000),900,900); system.time(d <- dist(m));
>>
>>    user  system elapsed
>>   3.536   0.007   1.321
>>
>> Works great! The remaining question is whether it is good practice to use
>> "R_num_math_threads" in external packages?
Most definitely not: it is never good practice to use undocumented
non-API variables. See 'Writing R Extensions'.
> Normally you don't need to mess with all this and I would recommend
> not to do so. The R internals use a different strategy since they
> need to cope with the fall-back case, but packages should not worry
> about that. The default number of threads is defined by the
> OMP_NUM_THREADS environment variable and that is the documented way
> in OpenMP, so my recommendation would be to not mess with
> num_threads() which is precisely why I did not use it in the example
> I gave you.
I'd be cautious there. OMP_NUM_THREADS affects all the OpenMP code in
the R session, and possibly others which use it (some parallel BLAS do
too).
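
To make the trade-off concrete, here is a minimal sketch of a .C()-style
routine that takes its thread count as an explicit argument, so that neither
OMP_NUM_THREADS nor any non-API R variable has to be touched. The function
name pkg_sqdist and the nthreads argument are hypothetical and are not part
of R or of any package; the num_threads clause and the _OPENMP macro are
standard OpenMP. It computes squared Euclidean distances between the rows of
a column-major matrix, stored in the same lower-triangle order that dist()
uses, and it makes no calls into the R API from inside the parallel region.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical .C() entry point (illustration only, not part of R).
 * x        : n x p matrix in column-major order, as R passes it
 * nthreads : thread count for this call only; values < 1 are treated as 1
 * out      : length n*(n-1)/2, lower triangle stored column by column
 */
void pkg_sqdist(double *x, int *n, int *p, int *nthreads, double *out)
{
    int nr = *n, nc = *p;
    int nt = (*nthreads > 0) ? *nthreads : 1;
    (void) nt; /* silences the unused warning when built without OpenMP */

    /* Without OpenMP the pragma is compiled out and the loop runs serially. */
#ifdef _OPENMP
#pragma omp parallel for num_threads(nt) schedule(dynamic)
#endif
    for (int j = 0; j < nr - 1; j++) {
        /* Offset such that out[base + i] is the slot for the pair (i, j), i > j. */
        long base = (long) nr * j - (long) j * (j + 1) / 2 - j - 1;
        for (int i = j + 1; i < nr; i++) {
            double s = 0.0;
            for (int k = 0; k < nc; k++) {
                double d = x[i + (long) nr * k] - x[j + (long) nr * k];
                s += d * d;
            }
            out[base + i] = s; /* each (i, j) slot is written by exactly one thread */
        }
    }
}
```

From R this would be called along the lines of
.C("pkg_sqdist", as.double(x), nrow(x), ncol(x), 2L, out = double(n*(n-1)/2))$out,
with n <- nrow(x). Whether a per-call argument like this is preferable to
relying on OMP_NUM_THREADS is a judgment call; the sketch only shows that the
choice can be kept local to the package's own code.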
>
> That said, R-devel has new facilities for parallelization so things
> may change in the future.
>
> Cheers,
> Simon
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595