[Rd] SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)

Norm Matloff matloff at cs.ucdavis.edu
Sun Dec 16 05:19:03 CET 2012


On Sat, Dec 15, 2012 at 10:58:34PM -0500, Simon Urbanek wrote:
> On Dec 15, 2012, at 7:38 PM, Norm Matloff wrote:
 
> > Even if one has the entire machine to oneself, there is often
> > another very good reason not to use the maximum number of cores:
> > Using the maximum number of cores may reduce performance.  This is
> > true in general, and sometimes especially true when the inferred
> > number of cores includes hyperthreading.
 
> Actually, the converse is often true (it depends on the machine
> architecture, though - I'm assuming true SMP machines here) -- often
> it is beneficial to run more threads than cores because the time spent
> waiting for access outside the CPU can be used by another thread that
> can continue computing. This is in particular true for parallel
> because of the setup overhead -- typically the real problem is memory,
> though. That said, the balance is heavily machine and task dependent
> so any default will be bad for some cases. Typically, for commodity
> machines with a couple dozen cores it's good to overload, for bigger
> machines it's bad.

Yes, it sometimes is beneficial to run more threads than cores.  But
"typically" is a rather risky term to use.  As usual, this is very
problem-dependent, and what is "typical" for one person may not be so
for another.  I would speculate, for instance, that most embarrassingly
parallel applications can benefit from some degree of oversubscription,
but even then I wouldn't go out on a limb.

At any rate, the main point for the OP is that there are performance
reasons not to set the number of threads/processors equal to the number
of cores.
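
For anyone who wants to check this on their own machine, here is a
minimal sketch using calls that already exist in 'parallel'
(detectCores(), makeCluster(), parLapply(), stopCluster()).  The toy
workload burn(), the helpers candidate_counts() and time_workers(), and
the cap of 16 workers are all made up for illustration; they are
assumptions, not anything proposed in this thread.  The timings will
vary by machine and by task, which is rather the point.

```r
library(parallel)

# Hypothetical CPU-bound toy task; substitute a real workload.
burn <- function(i) {
  x <- 0
  for (k in seq_len(2e5)) x <- x + sqrt(k)
  x
}

# Worker counts to try: half the cores, all cores, twice the cores.
# The cap is an arbitrary choice to keep the sketch cheap to run.
candidate_counts <- function(n_cores, cap = 16L) {
  counts <- c(n_cores %/% 2L, n_cores, 2L * n_cores)
  unique(pmin(cap, pmax(1L, counts)))
}

# Elapsed seconds for the same job on a cluster of n_workers processes.
# Cluster startup is deliberately left outside the timed region.
time_workers <- function(n_workers, n_tasks = 64L) {
  cl <- makeCluster(n_workers)
  on.exit(stopCluster(cl))
  system.time(parLapply(cl, seq_len(n_tasks), burn))[["elapsed"]]
}

n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 2L   # detectCores() can return NA

candidates <- candidate_counts(n_cores)
timings <- vapply(candidates, time_workers, numeric(1))
print(data.frame(workers = candidates, elapsed_sec = timings))
```

Whichever row comes out fastest for burn() says little about a
memory-bound or I/O-bound job, so one would want to rerun this with the
actual task before settling on a count.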

Norm


