[R] Parallel R
Luke Tierney
luke at stat.uiowa.edu
Thu Jul 10 20:01:08 CEST 2008
pnmath currently uses up to 8 threads (i.e. it will use 1, 2, 4, or 8
threads, depending on the size of the computation).
getNumPnmathThreads() should tell you the maximum number used on your
system, which should be 8 if the number of processors is being
identified correctly. With the size of m this calculation should be
using 8 threads, but the exp calculation is fairly fast, so the
threading overhead is noticeable. On a Linux box with 4 dual-core AMD
processors I get
> m <- matrix(0, 10000, 1000)
> mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
[1] 0.3859
> library(pnmath)
> mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
[1] 0.0775
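You can confirm the maximum thread count pnmath has detected with
getNumPnmathThreads() directly (a minimal check; the value reported
will depend on your hardware):

  library(pnmath)
  ## maximum number of threads pnmath will use on this machine;
  ## should match the processor count if detection worked
  getNumPnmathThreads()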
A similar example using qbeta, a slower function, gives
> p <- matrix(0.5,1000,1000)
> setNumPnmathThreads(1)
[1] 1
> mean(replicate(10, system.time(qbeta(p,2,3), gcFirst=TRUE))["elapsed",])
[1] 7.334
> setNumPnmathThreads(8)
[1] 8
> mean(replicate(10, system.time(qbeta(p,2,3), gcFirst=TRUE))["elapsed",])
[1] 0.9576
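For convenience, the timing idiom above can be wrapped in a small
helper (a sketch of my own, not part of pnmath; the name timeIt is
made up):

  ## average elapsed time over several runs; takes a function so the
  ## computation is actually re-run on every replicate
  timeIt <- function(f, reps = 10)
    mean(replicate(reps, system.time(f(), gcFirst = TRUE)["elapsed"]))

  ## e.g. timeIt(function() qbeta(p, 2, 3))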
On an 8-core Intel/OS X box the improvement for exp is much smaller,
but for qbeta it is similar.
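With the timeIt helper above, the whole single-thread vs. full-thread
comparison is easy to rerun on any box (a sketch; the actual numbers
will of course differ by machine):

  library(pnmath)
  m <- matrix(0, 10000, 1000)
  p <- matrix(0.5, 1000, 1000)
  for (nt in c(1, getNumPnmathThreads())) {
    setNumPnmathThreads(nt)
    cat(nt, "thread(s): exp", timeIt(function() exp(m)),
        "qbeta", timeIt(function() qbeta(p, 2, 3)), "\n")
  }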
luke
On Thu, 10 Jul 2008, Martin Morgan wrote:
> "Juan Pablo Romero Méndez" <jpablo.romero at gmail.com> writes:
>
>> Just out of curiosity, what system do you have?
>>
>> These are the results in my machine:
>>
>>> system.time(exp(m), gcFirst=TRUE)
>> user system elapsed
>> 0.52 0.04 0.56
>>> library(pnmath)
>>> system.time(exp(m), gcFirst=TRUE)
>> user system elapsed
>> 0.660 0.016 0.175
>>
>
> from cat /proc/cpuinfo, the original results were from a 32 bit
> dual-core system
>
> model name : Intel(R) Core(TM)2 CPU T7600 @ 2.33GHz
>
> Here's a second set of results on a 64-bit system with 16 cores (4
> cores on each of 4 physical processors, I think)
>
>> mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
> [1] 0.165
>> mean(replicate(10, system.time(exp(m), gcFirst=TRUE))["elapsed",])
> [1] 0.0397
>
> model name : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
>
> One thing is that for me, in single-thread mode, the faster processor
> actually evaluates more slowly. This could be because of 64-bit
> issues, other hardware design aspects, the way I've compiled R on the
> two platforms, or other system activity on the larger machine.
>
> A second thing is that the larger machine only accelerates about
> 4-fold, rather than a naive 16-fold; I think this comes from decisions
> in the pnmath code about the number of processors to use, although
> I'm not sure.
>
> A final thing is that running intensive tests on my laptop generates
> enough extra heat to increase the fan speed and laptop temperature. I
> sort of wonder whether consumer laptops / desktops are engineered for
> sustained use of all their cores (although I guess the gaming
> community makes heavy use of multiple cores).
>
> Martin
>
>
>
>> Juan Pablo
>>
>>
>>>
>>>> system.time(exp(m), gcFirst=TRUE)
>>> user system elapsed
>>> 0.108 0.000 0.106
>>>> library(pnmath)
>>>> system.time(exp(m), gcFirst=TRUE)
>>> user system elapsed
>>> 0.096 0.004 0.052
>>>
>>> (elapsed time about 2x faster). Both BLAS and pnmath make much better
>>> use of resources, since they do not require multiple R instances.
>>>
>>
>
>
--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
Department of Statistics and Actuarial Science
University of Iowa
241 Schaeffer Hall
Iowa City, IA 52242
Phone: 319-335-3386
Fax:   319-335-3017
Email: luke at stat.uiowa.edu
WWW:   http://www.stat.uiowa.edu