[Rd] Some timings for 64 bit Opteron (ATLAS, GOTO, std)
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Sat Mar 6 13:10:54 MET 2004
Martin Maechler <maechler at stat.math.ethz.ch> writes:
> ## gives
> ##                    ATLAS   GOTO    std
> ## boot-Ex            73.38  73.71  73.62
> ## nlme-Ex            31.92  34.18  31.91
> ## mgcv-Ex            29.20  31.69  29.35
> ## MASS-Ex            21.54  20.49  20.29
> ## stats-Ex           17.80  17.69  17.91
> ## lattice-Ex         11.38  11.37  11.05
> ## methods-Ex          6.87   6.53   6.58
> ## base-Ex             5.48   5.28   5.26
> ## graphics-Ex         4.71   4.73   4.70
> ## tools-Ex            3.86   3.66   3.82
> ## cluster-Ex          3.78   3.74   3.65
> ## utils-Ex            2.73   2.60   2.60
> ## p-r-random-tests    2.60   2.58   2.55
> ## survival-Ex         2.48   2.49   2.30
> ## ...
> ## .........
OK, I got around to checking this on the Opteron240 system and got
roughly the same pattern, with run times about 50% longer, which is to
be expected given the relative CPU speeds:
                    ATLAS    GOTO     std
boot-Ex            107.63  115.68  105.55
nlme-Ex             55.00   55.28   48.73
mgcv-Ex             36.45   43.02   40.14
MASS-Ex             34.02   35.14   30.81
stats-Ex            27.44   28.12   27.76
lattice-Ex          18.16   19.06   19.05
methods-Ex           9.94    9.86   10.53
base-Ex              8.56    8.70    8.56
graphics-Ex          7.66    7.72    7.43
cluster-Ex           5.69    5.81    5.47
tools-Ex             4.76    4.57    4.81
utils-Ex             4.44    4.37    5.77
demos2               3.88    3.82    3.63
demos                3.71    3.73    3.46
survival-Ex          3.66    3.76    3.61
p-r-random-tests     3.47    3.50    3.47
...
(The system was supposedly idle, but KDE was running on the console so
maybe not quite... Also, the odd cron job may have passed by.)
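
As a rough check of the "about 50%" remark, the std columns of the two
tables can be compared directly. This is only a sketch: the figures
are copied from the first five rows of each table, and the variable
names are made up for the illustration.

```r
# std timings (seconds) for the five longest-running suites,
# copied from the two tables above
suites      <- c("boot-Ex", "nlme-Ex", "mgcv-Ex", "MASS-Ex", "stats-Ex")
quoted_std  <- c(73.62, 31.91, 29.35, 20.29, 17.91)    # quoted table
opteron_std <- c(105.55, 48.73, 40.14, 30.81, 27.76)   # Opteron240 table

# Ratio > 1 means the Opteron240 run took longer
ratio <- setNames(opteron_std / quoted_std, suites)
print(round(ratio, 2))
```

For these five suites the ratios fall between about 1.35 and 1.55.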
So, basically, the threaded and optimized BLASes are essentially
no-ops for these suites of standard tasks. They only show their teeth
once you get to tasks which need hardcore numerics:
The runs below are Plain (std), ATLAS, Goto, in that order. First,
inverting a random 3000x3000 matrix:
pd@linux:~/r-devel> for i in BUILD* ; do (cd $i ; time echo 'set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(solve(m))'|bin/R --vanilla -q) ; done
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(solve(m))
[1] 251.90 1.14 253.08 0.00 0.00
>
real 4m20.967s
user 4m19.431s
sys 0m1.537s
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(solve(m))
[1] 3.86 1.10 27.24 0.00 0.00
>
real 0m35.633s
user 0m53.442s
sys 0m1.711s
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(solve(m))
[1] 30.06 1.15 31.76 0.00 0.00
>
real 0m39.804s
user 0m42.220s
sys 0m1.621s
(Notice how system.time gets the CPU usage wrong in the threaded
cases, worst of all for ATLAS. Presumably it is only counting one
process, and in the ATLAS case that is one which is mostly idle.)
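
One way to put a number on how many CPUs each build kept busy is to
divide the total CPU time reported by the shell's time (user + sys) by
the wall-clock time (real). The sketch below just redoes that
arithmetic on the figures printed above; the data frame and column
names are made up for the illustration. Bear in mind that these
figures also include R startup and generating the matrix, not only
solve().

```r
# Figures copied from the shell `time` output above, in seconds
timings <- data.frame(
  build = c("std", "ATLAS", "Goto"),
  real  = c(4 * 60 + 20.967, 35.633, 39.804),
  user  = c(4 * 60 + 19.431, 53.442, 42.220),
  sys   = c(1.537, 1.711, 1.621)
)

# CPU seconds consumed per wall-clock second; about 1 means one busy CPU
timings$cpu_per_wall <- (timings$user + timings$sys) / timings$real
print(timings)
```

With these numbers the ratio is about 1.0 for the standard build,
about 1.5 for ATLAS and about 1.1 for Goto.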
So for matrix inversion, ATLAS seems to be a little faster than Goto
(at the expense of a higher CPU utilization, mind you: the Goto
version appears to be running nearly single-threaded). For matrix
multiply, we have Goto as the fastest:
pd@linux:~/r-devel> for i in BUILD* ; do (cd $i ; time echo 'set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(m%*%m)'|bin/R --vanilla -q) ; done
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(m%*%m)
[1] 230.20 0.10 230.36 0.00 0.00
>
real 3m58.639s
user 3m57.857s
sys 0m0.455s
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(m%*%m)
[1] 0.34 0.01 16.49 0.00 0.00
>
real 0m25.253s
user 0m38.809s
sys 0m0.535s
> set.seed(1234);m<-matrix(rnorm(9e6),3e3);system.time(m%*%m)
[1] 12.94 0.08 13.06 0.00 0.00
>
real 0m21.629s
user 0m32.223s
sys 0m0.464s
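
For anyone wanting to repeat this inside a single R session rather
than via the shell loop, something along the following lines should
do. It is a minimal sketch, not what was run above: bench_blas is a
made-up helper name, and the default size is kept small so that it
finishes quickly even with the standard BLAS. Use n = 3000 to match
the runs above.

```r
# Hypothetical helper: time inversion and multiplication of a random
# n x n matrix, returning wall-clock (elapsed) seconds for each
bench_blas <- function(n = 300, seed = 1234) {
  set.seed(seed)
  m <- matrix(rnorm(n * n), n)
  c(solve  = system.time(solve(m))[["elapsed"]],
    matmul = system.time(m %*% m)[["elapsed"]])
}

res <- bench_blas()
print(res)
```

Elapsed time is used here rather than the CPU columns since, as noted
above, the CPU figures from system.time are not reliable for the
threaded builds.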
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907