[R-SIG-Mac] Nehalem performance [Was: Is R more heavy on memory or processor?]

Simon Urbanek simon.urbanek at r-project.org
Thu Aug 27 16:36:25 CEST 2009


It's been a while since this thread, but I could finally lay my hands on  
a Nehalem to test, so I thought I'd share some tidbits.

My worries turned out to be true and Nehalems are not as fast with R  
as the synthetic benchmarks would make you think. The results are a  
mixed bag, but on the overall R benchmarks a 2.26GHz 8-core Nehalem is  
somewhat slower (MASS 16.89s, R bench 20.4s) than a 2.66GHz quad-core  
Harpertown (MASS 18.8s, R bench 16.8s) and considerably slower than a  
2.8GHz 8-core Harpertown (MASS 16.04s, R bench 15.2s). This is mainly  
due to the fact that most tasks cannot use the extra cores while the  
individual cores run at lower frequencies.

Some partial results are startling - for example the "2800x2800  
cross-product matrix (b = a' * a)" test literally kills Nehalems (0.64s  
for the 2.8GHz and 1.02s for the 2.66GHz Harpertown versus 4.67s(!!) for  
the Nehalem). There are outliers in the other direction as well, but not  
as surprising: "FFT over 2,400,000 random values" is 0.64s for the  
Nehalem and 0.93s/1.8s for the Harpertowns.
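
If you want a rough feel for those two particular tests on your own  
machine, something like the following will do - note this is only an  
approximation of the benchmark code, not the exact script from the suite:

  # "2800x2800 cross-product matrix (b = a' * a)"
  a <- matrix(rnorm(2800 * 2800), 2800, 2800)
  system.time(b <- crossprod(a))   # crossprod(a) computes a' * a via BLAS

  # "FFT over 2,400,000 random values"
  x <- runif(2400000)
  system.time(fft(x))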

The tests used are from http://r.research.att.com/benchmarks
(the R 2.9.2 release was used, although admittedly the OS version varied).
Clearly, benchmarks never tell the full story and there may be uses  
that take advantage of one architecture or the other, but the bottom  
line is that Nehalems are not universally faster, so don't throw your  
Harpertowns out just yet ;).

I expect to get a 2.66GHz Nehalem to test soon and will post an update.

If you have any experiences to share, please feel free to chip in.

Cheers,
Simon

PS: I have also looked at 32-bit vs 64-bit and it's an equally mixed  
bag - some tasks are considerably slower in 64-bit, others considerably  
faster, so no clear answer there, either.

On Mar 24, 2009, at 16:01 , Simon Urbanek wrote:

>
> On Mar 24, 2009, at 14:55 , Booman, M wrote:
>
>> Dear all,
>>
>> I am going to purchase a Power Mac (a new one, with Nehalem  
>> processor) for my R-based microarray analyses. I use mainly  
>> Bioconductor packages, and a typical dataset would consist of 50  
>> microarrays with 40,000 datapoints each. To make the right choice  
>> of processor and memory, I have a few questions:
>>
>
> I don't use BioC [you may want to ask on the BioC list instead (or  
> hopefully some BioC users will chip in)], so my recommendations may  
> be based on slightly different problems.
>
>
>> - would the current version of R benefit from the 8 cores in the  
>> new Intel Xeon Nehalem 8-core Mac Pro? So would an 8-core 2.26GHz  
>> machine be better than a 4-core 2.93GHz?
>
> Unfortunately I cannot comment on Nehalems, but in general with  
> Xeons you do feel quite a difference in the clock speed, so I  
> wouldn't trade 2.93GHz for 2.26GHz regardless of the CPU generation.  
> It is true that pre-Nehalem Mac Pros cannot feed 8 cores, so you may  
> want to go for the new Mac Pros, but I wouldn't even think about the  
> 2.26GHz option. Some benchmarks suggest that the 2.26GHz Nehalem can  
> still compete favorably if a lot of memory/IO is involved, but they  
> were not very convincing and I cannot tell first-hand.
>
>
>> Or can R only use one core (in which case the 4-core 2.93GHZ  
>> machine would be better)?
>>
>
> R can use multiple cores in many ways - through BLAS (the default in R  
> for Mac OS X), vector op parallelization (Luke's pnmath) or explicit  
> parallelization such as forking (multicore) or parallel processes  
> (snow). The amount of parallelization achievable depends heavily on  
> your applications. I routinely use all cores, but then I'm usually  
> modeling my problems that way.
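>
> To give a flavor of the explicit route, here is a minimal sketch using  
> mclapply() from the multicore package - the bootstrap-fit loop below is  
> just a made-up toy example, not BioC code:
>
>   library(multicore)            # forking-based parallelism on Mac OS X
>   fits <- mclapply(1:8, function(i) {
>       # each call runs in a forked child process, so the 8 fits
>       # can run on 8 cores in parallel
>       d <- trees[sample(nrow(trees), replace = TRUE), ]
>       lm(Volume ~ Girth + Height, data = d)
>   })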
>
>
>> - If R does not benefit from multiple cores yet, is there anything  
>> known about whether Snow Leopard might make a difference in this?
>>
>
> I cannot comment on details of ongoing work due to the NDA associated  
> with Snow Leopard, but technically, from the Apple announcements, you  
> can deduce that the only possible improvements directly related to R  
> can be achieved in implicit parallelization, which is essentially the  
> pnmath path. There is not much more you can do in R save for a  
> rewrite of the methods you want to deal with.
>
> In fact, the hope is rather that the packages for R start using  
> parallelization more effectively, but that's not something Snow  
> Leopard alone can change.
>
>
>> - To determine if my first priority should be processor speed or  
>> RAM, on which does R rely more heavily?
>>
>
> In my line of work (which is not bioinf, though) RAM turned out to be  
> more important, because the drop-off when you run out of memory is  
> sudden and devastatingly huge. With CPUs you'll just have to wait a bit  
> longer, but the difference is directly proportional to the CPU speed  
> you get, so it is never as bad as running out of wired RAM. (BTW: in  
> general you don't want to buy RAM from Apple - as much as I like  
> Apple, there are compatible RAM sets at a fraction of the cost of  
> what Apple charges, especially for Mac Pros - but there is always  
> the 1st generation issue *).
>
>
>> - The new chipset has 3 memory channels (forgive me if I word this  
>> wrong, as you may have noticed I am no computer tech) so it can  
>> read 6GB of RAM faster than it can read 8GB of RAM; so for a program  
>> that relies more on RAM speed than RAM quantity it is recommended  
>> to use 6GB instead of 8GB for better performance (or any multiple of  
>> 3). Which is more important for R, RAM speed or RAM quantity?
>>
>
> 6GB is very little RAM, so I don't think that's an option ;) - but  
> yes, you should care about the size first. The channels and timings  
> only define how you populate the slots. Note that the 4-core Nehalem  
> has only 4 slots, so it's not very expandable - I'd definitely get an  
> older 8-core machine with 16GB RAM or more rather than something that  
> can take only 8GB ...
>
>
>> (I am not sure if it helps to know, but previously I used a  
>> Power Mac G5 quad-core (sadly I forgot the processor speed, but it  
>> was the standard G5 quad-core) with 4 GB RAM for datasets of 30-40  
>> microarrays of 18,000 datapoints each, and the analysis was OK except  
>> for some memory errors in a script that used permutation analysis;  
>> but it wasn't very fast.)
>>
>
> I would keep an eye on the RAM expandability - even if you buy less  
> RAM now, a ceiling of 8GB is very low. It may turn out that larger  
> DIMMs will become available, but even 16GB is not enough for the  
> future. As with all 1st generation products the prices will go down  
> a lot over time, so you may plan to upgrade later. Another point  
> worth considering is that you can always upgrade RAM easily, but a  
> CPU upgrade is much more difficult.
>
> Cheers,
> Simon
>
> _______________________________________________
> R-SIG-Mac mailing list
> R-SIG-Mac at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>
>


