[R] (performance) time in Windows vs Linux

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jun 29 10:13:03 CEST 2009


I meant to write "not so for 'top'" in the final para.

On Mon, 29 Jun 2009, Prof Brian Ripley wrote:

> On Mon, 29 Jun 2009, Raymond Wan wrote:
>
>> milton ruser wrote:
>
>>> In fact I have a quad-core. But how can I know whether Linux is really using 
>>> only one core, and how can I set it up to use all 4 cores?
>> 
>> I don't know the answer in the context of R -- I didn't know that R can use 
>> multiple cores by default?
>
> It cannot, and much of this thread is pure speculation.  So let's try to set 
> the record straight (as we have already done in the manuals).
>
> The only way that a single R process will use more than one CPU is if 
> you have added a multithreaded BLAS (and I've never heard of one being used 
> successfully with R for Windows) or another add-on such as Luke Tierney's 
> pnmath[0] packages.  Packages such as snow and multicore run multiple R 
> processes.
>
> I do run a multithreaded BLAS on my 8-core Linux box and do often see 'top' 
> well over 100% -- I just tested and saw 798.9%.
>
> It is exceptional to see R under Windows running faster than a well-tuned R 
> under Linux on the same hardware (and my only Windows machine is a multiboot 
> that normally runs Linux, so I do have extensive experience).  There are a 
> number of reasons:
>
> - R for Windows always uses a shared library, whereas under Linux by default 
> it does not, for speed -- see the R-admin manual.
>
> - MinGW until recently had only an older compiler, gcc 4.2.1 (gcc 4.4.0 for 
> MinGW is just out, but I have not tried it).  gcc 4.3.x has both better 
> general optimizations and better support for the Core 2 Duo my machine has.
>
> - You can tune the Linux version better by compiling yourself (although some 
> tuning is possible on Windows).
>
> - Linux uses interrupts for things that Windows polls for (or that, in some 
> instances, R itself polls for on Windows).  That includes the overhead on 
> Windows of running Rgui (if you are using that rather than Rterm) and of 
> polling the Windows message system.
>
> - 32-bit Linux allows access to more address space than 32-bit Windows, so 
> there may be less frequent garbage collections on large tasks.  In any case, 
> the Linux memory manager is more efficient.
>
> Against that, a 64-bit build will in general be slower than a 32-bit one -- 
> see the R-admin manual.  If you run 32-bit R for Windows on 64-bit Windows 
> you are running under the WOW64 subsystem and that has a small overhead: but 
> in our tests the REvolution 64-bit build of R was slightly slower.
>
> But we are only talking about small differences, say up to 20% and usually 
> more like 5-10%.
>
> It is usually possible to find some task that a particular compiler optimizes 
> badly, so there will be rare exceptions.
>
>> But in general, I use "htop", whose man page describes it as:  "This 
>> program is a free (GPL) ncurses-based process viewer."
>> 
>> It is a colored version of "top", essentially.  At the top of the screen, 
>> you will see your 4 cores represented as percentages.  Under Setup, add 
>> "Processor" to the list of options; "CPU" will then appear as a column 
>> whose values, if you have 4 cores, will vary from 1 to 4.
>> 
>> If you want to check if R is running on more than one core, then obviously 
>> R should appear more than once and with two different values under CPU.
>
> Not so: that will happen if multiple copies of R are running, not if a single 
> copy of R is running multiple threads.
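
To make the distinction concrete on Linux: a per-thread listing (for example
'top -H', or 'ps -eLo pid,lwp,psr,pcpu,comm') prints one row per thread, so a
single multithreaded R process shows up as several rows sharing one PID,
whereas snow or multicore workers show up as separate PIDs.  What follows is
only a minimal Python sketch that groups such a listing by PID.  The sample
text, the PIDs and the helper name summarize() are invented for illustration,
and it assumes the process is listed under the command name "R".

```python
from collections import defaultdict

# Hypothetical output of: ps -eLo pid,lwp,psr,pcpu,comm
# (one row per thread; all numbers are invented for illustration)
SAMPLE = """\
  PID   LWP PSR %CPU COMMAND
 4101  4101   0 99.0 R
 4101  4102   1 98.5 R
 4101  4103   2 98.7 R
 4101  4104   3 99.1 R
 5200  5200   0 97.0 R
 5201  5201   1 96.0 R
"""


def summarize(ps_text, command="R"):
    """Group per-thread rows by PID for processes named `command`."""
    by_pid = defaultdict(list)
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue  # ignore blank or malformed rows
        pid, lwp, psr, pcpu, comm = fields
        if comm.strip() == command:
            by_pid[int(pid)].append((int(lwp), int(psr), float(pcpu)))

    summary = {}
    for pid, threads in by_pid.items():
        summary[pid] = {
            "threads": len(threads),                          # rows sharing this PID
            "cpus": sorted({psr for _, psr, _ in threads}),   # CPUs last used
            "total_pcpu": round(sum(p for _, _, p in threads), 1),
        }
    return summary


if __name__ == "__main__":
    for pid, info in sorted(summarize(SAMPLE).items()):
        kind = "multithreaded" if info["threads"] > 1 else "single-threaded"
        print(pid, kind, info)
```

In the invented sample, PID 4101 is one R process with four threads spread
over CPUs 0-3 (the case of a multithreaded BLAS), while PIDs 5200 and 5201 are
two separate single-threaded R processes (the snow/multicore case).  To try it
on a real machine, pass the captured output of the ps command above to
summarize() in place of SAMPLE.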

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



