[Rd] How to safely use OpenMP pragmas inside a .C() function?

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Sep 1 08:34:12 CEST 2011


Note that currently R internals do not actually use multiple threads 
in OpenMP, and there is no documented way to make them do so.

The main issue is insufficient knowledge of where multiple threads 
are worthwhile.  That depends on both the OS and the platform: we do 
not even have a reliable cross-platform way to decide on a reasonable 
number of threads, and the number of virtual cores on a multi-user 
platform is definitely not a reasonable choice.  Luke Tierney 
reported, for example, that the crossover point for a speed-up needed 
much larger matrices on Mac OS X than on Linux, and there is currently 
no OpenMP support in the Windows toolchain.
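
As an illustration of the crossover point, here is a minimal sketch of 
a .C()-style routine that only goes parallel above a size cutoff, 
using the OpenMP 'if' clause.  The function name scale_vec and the 
cutoff PAR_THRESHOLD are hypothetical, and the value 10000 is a 
placeholder: a sensible cutoff is platform-dependent and would have to 
be measured.

```c
#include <stddef.h>

/* Hypothetical cutoff: below this length the loop runs serially.
   The value is a placeholder and must be measured per platform. */
#define PAR_THRESHOLD 10000

/* .C()-style entry point: every argument arrives as a pointer.
   Multiplies each element of x (length *n) by *factor, in place. */
void scale_vec(double *x, int *n, double *factor)
{
    int len = *n;
    double f = *factor;

    /* Without OpenMP support the pragma is ignored and the loop is
       serial; with it, threads are used only for long vectors. */
#pragma omp parallel for if(len > PAR_THRESHOLD)
    for (int i = 0; i < len; i++)
        x[i] *= f;
}
```

Compiled without OpenMP, the routine gives the same result, so a 
toolchain that lacks OpenMP is handled without a separate code path.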

The current implementation is a trial: there are more places planned 
to use OpenMP as and when the uncertainties are resolved.

The lack of OpenMP support on Windows will change at some point: 
given the current instability in thread support in the MinGW-w64 
project, this may or may not be before R 2.14.0.

On Wed, 31 Aug 2011, Simon Urbanek wrote:

> Pawel,
>
> On Aug 31, 2011, at 4:46 PM, pawelm wrote:
>
>> I just found this (performance improvement of the "dist" function when using
>> openmp):

You failed to describe the platform!  See the posting guide (which 
asked you to do so 'at a minimum').

>> .Internal(setMaxNumMathThreads(1)); .Internal(setNumMathThreads(1)); m <-
>> matrix(rnorm(810000),900,900); system.time(d <- dist(m))
>>
>>  user  system elapsed
>>  3.510   0.013   3.524
>>
>> .Internal(setMaxNumMathThreads(5)); .Internal(setNumMathThreads(5)); m <-
>> matrix(rnorm(810000),900,900); system.time(d <- dist(m));
>>
>>   user  system elapsed
>>  3.536   0.007   1.321
>>
>> Works great! Just the question stays if it's a good practice to use
>> "R_num_math_threads" in external packages?

Most definitely not: it is never good practice to use undocumented 
non-API variables. See 'Writing R Extensions'.

> Normally you don't need to mess with all this and I would recommend 
> not to do so. The R internals use a different strategy since they 
> need to cope with the fall-back case, but packages should not worry 
> about that. The default number of threads is defined by the 
> OMP_NUM_THREADS environment variable and that is the documented way 
> in OpenMP, so my recommendation would be to not mess with 
> num_threads() which is precisely why I did not use it in the example 
> I gave you.

I'd be cautious there.  OMP_NUM_THREADS affects all the OpenMP code in 
the R session, and possibly other libraries that read it (some 
parallel BLAS implementations do too).
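
One way to keep the thread count local to a single routine is to pass 
it in as an argument and apply it with the num_threads clause, which 
leaves OMP_NUM_THREADS and the rest of the session alone.  The 
following is only a sketch under that assumption: the function 
row_sums and its nthreads argument are hypothetical, and it does not 
touch any R internals.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* .C()-style entry point: every argument arrives as a pointer.
   x is an nrow-by-ncol matrix in column-major order (as in R);
   out receives the nrow row sums.  nthreads is a caller-supplied
   request that applies to this call only. */
void row_sums(double *x, int *nrow, int *ncol, int *nthreads, double *out)
{
    int nr = *nrow, nc = *ncol;
    int nt = (*nthreads > 0) ? *nthreads : 1;

#ifdef _OPENMP
    /* Never ask for more than the runtime's current maximum, which
       already reflects OMP_NUM_THREADS if the user has set it. */
    int cap = omp_get_max_threads();
    if (nt > cap) nt = cap;
#endif

    /* num_threads() affects only this parallel region. */
#pragma omp parallel for num_threads(nt)
    for (int i = 0; i < nr; i++) {
        double s = 0.0;
        for (int j = 0; j < nc; j++)
            s += x[i + (size_t) nr * j];
        out[i] = s;
    }
}
```

Each row is handled by exactly one thread and each thread writes only 
its own out[i], so no locking is needed.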

>
> That said, R-devel has new facilities for parallelization so things 
> may change in the future.
>
> Cheers,
> Simon

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


