[R-SIG-Mac] Perplexed benchmark result from a new Macbook Pro Core i5
Yan Zhou
zhouyan1014 at gmail.com
Wed May 12 14:44:08 CEST 2010
My best guess about the BLAS result is that vecLib is not yet optimised for the i5 and i7. There is little point in optimising it for some "future products", so it can hardly have been optimised before the new MBP was released, and there have surely been no updates since the release. This is just my guess.
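One way to test a guess like this is to run the same BLAS-bound calls several times under each BLAS (vecLib, Goto2) and compare the medians rather than single runs. Below is a minimal sketch; `bench()` is a hypothetical helper, not something from the thread, and the matrix is smaller than the 2000x2000 one used in the original post so that it finishes quickly.

```r
# Hypothetical helper (not from the thread): run f() `reps` times and
# return the median user and elapsed seconds reported by system.time().
bench <- function(f, reps = 5) {
  times <- vapply(seq_len(reps),
                  function(i) system.time(f())[c("user.self", "elapsed")],
                  c(user.self = 0, elapsed = 0))
  apply(times, 1, median)
}

set.seed(1)
n <- 500                          # assumption: reduced from 2000 for speed
A <- matrix(rnorm(n * n), n, n)
B <- crossprod(A)

# One row per operation, columns are median user and elapsed seconds.
res <- rbind(crossprod = bench(function() crossprod(A)),
             solve     = bench(function() solve(B)))
print(res)
```

Running this once per BLAS on the same machine should give a fairer comparison than a single system.time() call, since the first call often includes some warm-up cost.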
On 10 May 2010, at 17:29, Gardar Johannesson wrote:
> To clarify, both Macbook Pro tests were carried out on OS X 10.6.3. Regarding memory speed, the new laptop uses 1067MHz DDR3, while the older one uses 667MHz DDR2. In short, the 2008 laptop has both a slower and older CPU and slower memory, but I think it has a slightly larger cache (4MB versus 3MB, though I think there is more to it).
>
> I did a little linear regression (lm()) test with a 100000x100 matrix. In this case, the Core i5 finished in ~3.2sec while the Core 2 Duo finished in ~5.7sec. So that was good news. But I have not been able to explain the BLAS performance, which I also tested with 500 and 5000 dimensional matrices with the same results (i.e. Core 2 Duo ahead of Core i5).
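For anyone who wants to repeat the lm() test, here is a rough sketch. The thread only gives the 100000x100 dimensions, so the data-generating model and the name `time_lm()` are assumptions. As far as I know, lm() uses a LINPACK-style QR that leans mostly on level-1 BLAS, so it probably exercises the BLAS quite differently from crossprod() and solve().

```r
# Sketch of an lm() timing test. The simulated model (all coefficients 1,
# unit-variance noise) is an assumption; only the dimensions come from the post.
time_lm <- function(n = 100000, p = 100) {
  set.seed(1)
  X <- matrix(rnorm(n * p), n, p)
  y <- drop(X %*% rep(1, p)) + rnorm(n)
  tm <- system.time(fit <- lm(y ~ X))
  list(time = tm, fit = fit)
}

# Small problem for a quick check; call time_lm() with the defaults
# to get the 100000 x 100 case described above.
res <- time_lm(n = 2000, p = 10)
print(res$time)
```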
>
> I guess there is no simple explanation for this.
>
> Thanks for looking into this,
> Gardar
>
> On May 10, 2010, at 6:20 AM, Simon Urbanek wrote:
>
>>
>> On May 8, 2010, at 2:53 PM, Gardar Johannesson wrote:
>>
>>> I was just replacing a Macbook Pro from 2008 (with a 2.2GHz Intel Core 2 Duo) with a new Macbook Pro (with a 2.4GHz Intel Core i5). To get a rough idea about the difference in R execution speed I ran a small test, with the output shown below. In short:
>>>
>>> 1) The new Macbook Pro was ca 60% _slower_ at linear algebra (crossprod() and solve())
>>> 2) The new Macbook Pro was ca 17% faster on a long for-loop
>>> 3) Linking against Goto2 versus vecLib improved the linear algebra results slightly
>>>
>>> Both tests were done using the same 2.11.0 dmg image from CRAN.
>>>
>>> Any thoughts on this?
>>>
>>
>> What OS X versions are on the respective machines? The vecLib performance varies greatly with their versions.
>>
>>
>>> Any ideas how I can improve the performance results? What about compiling from source?
>>>
>>
>> Note that you are essentially just comparing the BLAS libraries on each machine; R is practically not involved in this at all. So if you meant compiling R from source, then the answer is likely no (R speed is what you see in the loops).
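It may help to separate the two columns when reading the quoted timings: "user" is CPU time summed over all threads, while "elapsed" is wall-clock time, so with a multithreaded BLAS the user time can exceed the elapsed time. A small sketch that computes the i5 / Core 2 Duo ratios from the first vecLib run of each test in the original post (the data frame and column names are mine, the numbers are copied from the post):

```r
# Timings copied from the first run of each test in the original post
# (vecLib results only). Column names are illustrative.
timings <- data.frame(
  test        = c("crossprod", "solve", "loop"),
  i5_user     = c(2.500, 7.208, 2.964),
  i5_elapsed  = c(0.816, 2.740, 3.528),
  c2d_user    = c(1.429, 4.532, 3.501),
  c2d_elapsed = c(0.800, 2.860, 4.215)
)

# Ratios above 1 mean the Core i5 took longer than the Core 2 Duo.
timings$user_ratio    <- timings$i5_user / timings$c2d_user
timings$elapsed_ratio <- timings$i5_elapsed / timings$c2d_elapsed

print(timings[, c("test", "user_ratio", "elapsed_ratio")], digits = 3)
```

On these numbers the slowdown for crossprod() and solve() shows up in the user column, while the elapsed ratios for both stay close to 1. So it seems worth checking which column a "slower" claim refers to before blaming the BLAS.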
>>
>>
>> I don't have an i5 around, but comparing similar architectures (Penryn vs Nehalem) gives a slight edge to Nehalem (3.8s @ 2.8GHz vs 3.4s @ 2.66GHz on solve(B)), but that is for a Xeon, so the memory speed may be the edge (everything on OS X 10.6.3).
>>
>> Cheers,
>> Simon
>>
>>
>>>
>>> Thanks,
>>> Gardar Johannesson
>>>
>>>
>>> ###########################################
>>> ## Results from new macbook pro (Core i5 @ 2.4GHz)
>>>
>>>> set.seed(1)
>>>> A <- matrix(rnorm(2000*2000),2000,2000)
>>>> system.time(B <- crossprod(A))
>>> user system elapsed
>>> 2.500 0.058 0.816
>>>> system.time(B <- crossprod(A))
>>> user system elapsed
>>> 2.502 0.050 0.814
>>>> system.time(solve(B))
>>> user system elapsed
>>> 7.208 0.265 2.740
>>>> system.time(solve(B))
>>> user system elapsed
>>> 7.121 0.264 2.666
>>>> system.time({a <- rep(1.0,100); for(i in 1:1e6) a <- 1.0*a+0.0})
>>> user system elapsed
>>> 2.964 0.602 3.528
>>>> system.time({a <- rep(1.0,100); for(i in 1:1e6) a <- 1.0*a+0.0})
>>> user system elapsed
>>> 3.040 0.732 3.732
>>>> sessionInfo()
>>> R version 2.11.0 (2010-04-22)
>>> i386-apple-darwin9.8.0
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> loaded via a namespace (and not attached):
>>> [1] tools_2.11.0
>>>>
>>>
>>> ###################################################
>>> ## Results from old macbook pro (Core 2 Duo @ 2.2GHz)
>>>
>>>> set.seed(1)
>>>> A <- matrix(rnorm(2000*2000),2000,2000)
>>>> system.time(B <- crossprod(A))
>>> user system elapsed
>>> 1.429 0.073 0.800
>>>> system.time(B <- crossprod(A))
>>> user system elapsed
>>> 1.429 0.064 0.874
>>>> system.time(solve(B))
>>> user system elapsed
>>> 4.532 0.285 2.860
>>>> system.time(solve(B))
>>> user system elapsed
>>> 4.521 0.281 2.834
>>>> system.time({a <- rep(1.0,100); for(i in 1:1e6) a <- 1.0*a+0.0})
>>> user system elapsed
>>> 3.501 0.764 4.215
>>>> system.time({a <- rep(1.0,100); for(i in 1:1e6) a <- 1.0*a+0.0})
>>> user system elapsed
>>> 3.459 0.702 4.113
>>>> sessionInfo()
>>> R version 2.11.0 (2010-04-22)
>>> i386-apple-darwin9.8.0
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>>
>>>
>>>
>>> ###################################################
>>> ## Results from new macbook pro (Core i5 @ 2.4GHz)
>>> ## Linking against Goto2 BLAS (vs vecLib)
>>>
>>>> set.seed(1)
>>>> A <- matrix(rnorm(2000*2000),2000,2000)
>>>> system.time(B <- crossprod(A))
>>> user system elapsed
>>> 2.348 0.124 0.635
>>>> system.time(B <- crossprod(A))
>>> user system elapsed
>>> 2.342 0.110 0.622
>>>> system.time(solve(B))
>>> user system elapsed
>>> 6.634 0.327 2.158
>>>> system.time(solve(B))
>>> user system elapsed
>>> 6.697 0.348 2.034
>>>> system.time({a <- rep(1.0,100); for(i in 1:1e6) a <- 1.0*a+0.0})
>>> user system elapsed
>>> 2.577 0.548 2.885
>>>> system.time({a <- rep(1.0,100); for(i in 1:1e6) a <- 1.0*a+0.0})
>>> user system elapsed
>>> 2.411 0.478 2.859
>>>> sessionInfo()
>>> R version 2.11.0 (2010-04-22)
>>> i386-apple-darwin9.8.0
>>>
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>>>
>>> _______________________________________________
>>> R-SIG-Mac mailing list
>>> R-SIG-Mac at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>>>
>>>
>>
>