[R] (performance) time in Windows vs Linux

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jun 29 09:59:24 CEST 2009


On Mon, 29 Jun 2009, Raymond Wan wrote:

> milton ruser wrote:

>> In fact I have a quadcore. But how can I know if Linux are really 
>> using only one core, and how can I setup it to use the 4cores?
>
> I don't know the answer in the context of R -- I didn't know that R can use 
> multiple cores by default?

It cannot, and much of this thread is pure speculation.  So let's try 
to set the record straight (as we have already done in the manuals).

The only way that a single R process will be using more than one CPU 
is if you have added a multithreaded BLAS (and I've never heard of one 
being used successfully with R for Windows) or another add-on such as 
Luke Tierney's pnmath[0] packages.  Packages such as snow and multicore 
run multiple R processes.

I do run a multithreaded BLAS on my 8-core Linux box and do often see 
'top' well over 100% -- I just tested and saw 798.9%.
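
As a rough illustration of what a 'top' figure over 100% means, here is a 
minimal Python sketch (not R, and not something from this thread) that 
compares the CPU time charged to a process with the wall-clock time 
elapsed.  The function names are hypothetical.  A ratio near 1 suggests 
one busy core; a ratio near 8 would correspond to the 798.9% above.

```python
import time


def cpu_to_wall_ratio(work):
    """Run work() and return process CPU seconds / wall-clock seconds.

    time.process_time() sums user and system CPU time over all threads
    of this process, so a multithreaded workload can give a ratio > 1.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used


def busy_loop(n=2_000_000):
    """A single-threaded, CPU-bound task (illustrative only)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def top_percent_to_cores(percent):
    """Convert a 'top' %CPU figure to an approximate number of busy cores."""
    return percent / 100.0


if __name__ == "__main__":
    print("single-threaded ratio:", round(cpu_to_wall_ratio(busy_loop), 2))
    print("798.9% is about", top_percent_to_cores(798.9), "cores")
```

The ratio only covers the current process, so it says nothing about 
work done by separate worker processes.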

It is exceptional to see R under Windows running faster than a 
well-tuned R under Linux on the same hardware (and my only Windows 
machine is a multiboot that normally runs Linux, so I do have 
extensive experience).  There are a number of reasons:

- R for Windows always uses a shared library, whereas under Linux by 
default it does not, for speed -- see the R-admin manual.

- MinGW until recently had only an older compiler, 4.2.1 (gcc 4.4.0 
for mingw is just out, but I have not tried it).  gcc 4.3.x has both 
better general optimizations and better support for the Core 2 Duo my 
machine has.

- You can tune the Linux version better by compiling yourself 
(although some tuning is possible on Windows).

- Linux uses interrupts for things that Windows polls for (or, in some 
instances, that R itself polls for on Windows).  That includes the 
overhead on Windows of running Rgui (if you are using that rather than 
Rterm) and polling the Windows message system.

- 32-bit Linux allows access to more address space than 32-bit 
Windows, so there may be less frequent garbage collections on large 
tasks.  In any case, the Linux memory manager is more efficient.

Against that, a 64-bit build will in general be slower than a 32-bit 
one -- see the R-admin manual.  If you run 32-bit R for Windows on 
64-bit Windows you are running under the WOW64 subsystem and that has a 
small overhead: but in our tests the REvolution 64-bit build of R was 
slightly slower.

But we are only talking about small differences, say up to 20% and 
usually more like 5-10%.

It is usually possible to find some task that a particular compiler 
optimizes badly, so there will be rare exceptions.

> But in general, I use "htop", whose man pages 
> describes it as:  "This program is a free (GPL) ncurses-based process 
> viewer."
>
> It is a colored version of "top", essentially.  At the top of the screen, you 
> will see your 4 cores represented as percentages.  Under Setup, add 
> "Processor" to the list of options and then "CPU" will appear as a column, 
> which if you have 4 cores, the values will vary from 1 to 4.
>
> If you want to check if R is running on more than one core, then obviously R 
> should appear more than once and with two different values under CPU.

Not so: that will happen if multiple copies of R are running, not if a 
single copy of R is running multiple threads.
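
To tell the two cases apart on Linux, one can look at the thread count 
of a single process rather than at how many processes are listed.  A 
minimal Python sketch, assuming a Linux /proc filesystem (the 'Threads:' 
line of /proc/<pid>/status); the function names are hypothetical:

```python
import os


def threads_from_status(status_text):
    """Return the value of the 'Threads:' line of a /proc/<pid>/status
    file, or None if no such line is present."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1])
    return None


def thread_count(pid):
    """Number of threads in process `pid` (Linux only: reads /proc)."""
    with open(f"/proc/{pid}/status") as fh:
        return threads_from_status(fh.read())


if __name__ == "__main__":
    # Assumption: run on Linux; other platforms have no /proc/<pid>/status.
    if os.path.exists("/proc/self/status"):
        print("threads in this process:", thread_count(os.getpid()))
```

One process showing several threads is the multithreaded case; several 
separate processes, each with its own pid, is what snow or multicore 
would produce.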

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



