[Rd] Peculiar timing result
Douglas Bates
bates at stat.wisc.edu
Fri Mar 3 19:26:32 CET 2006
I don't think this calculation is memory-bound at all and I would be
surprised if changing to a 32-bit environment would change things. I
do have a 32-bit chroot environment on these machines (needed for
things like wine and acroread) so I'll try that out but I think I will
need to use Atlas as the accelerated BLAS rather than Goto's BLAS. I
imagine the Opteron/Athlon 64 version of Goto's BLAS assumes a 64-bit
environment.
On 3/3/06, Paul Gilbert <pgilbert at bank-banque-canada.ca> wrote:
> Doug
>
> This is probably not your reason, but I am finding my dual core Athlon
> 64 is much slower running 64 bit Linux and R than it was running 32 bit
> Linux and R. All the programs are bigger. (Some, like the clock applet,
> are a lot bigger for no obvious reason.) The difference is enough to
> put my meager 1GB machine into swapping much more, with the result that
> it is a lot slower.
>
> Paul
>
> Douglas Bates wrote:
>
> >I have been timing a particular model fit using lmer on several
> >different computers and came up with a peculiar result - the model fit
> >is considerably slower on a dual-core Athlon 64 using Goto's
> >multithreaded BLAS than on a single-core processor.
> >
> >Here is the timing on a single-core Athlon 64 3000+ running under
> >today's R-devel with version 0.995-5 of the Matrix package.
> >
> >
> >
> >>library(Matrix)
> >>data(star, package = 'mlmRev')
> >>system.time(fm1 <- lmer(math~gr+sx+eth+cltype+(yrs|id)+(1|tch)+(yrs|sch), star, control = list(nit=0,grad=0,msV=1)))
> >>
> >>
> > 0 241720.: 1.16440 0.335239 0.00000 1.78732 0.867209 0.382318 0.00000
> > 1 239722.: 1.94952 5.00000e-10 0.00933767 1.65999 0.858003
> >0.341520 0.00908757
> > 2 239580.: 1.95924 0.0884059 0.00933767 1.65308 0.857487
> >0.339296 0.00954718
> > 3 239215.: 2.60877 0.0765848 0.0177699 1.45739 0.843562
> >0.275100 0.0236849
> > 4 239204.: 2.62582 0.106670 0.0239698 1.41976 0.841086
> >0.261033 0.0267073
> > 5 239176.: 2.63149 0.0787924 0.0367185 1.37952 0.838527
> >0.245076 0.0301134
> > 6 239141.: 2.64949 0.108534 0.0594586 1.28846 0.832543
> >0.208404 0.0375456
> > 7 239049.: 2.64794 0.0789214 0.121782 1.10436 0.819711
> >0.126101 0.0524965
> > 8 239004.: 2.66044 0.117957 0.181505 0.932202 0.798982
> >0.0718116 0.0628958
> > 9 238944.: 2.66310 0.0819653 0.334477 0.631735 0.740855
> >0.258359 0.0806590
> > 10 238893.: 2.72626 0.0975205 0.653432 0.703912 0.666067
> >0.109922 0.201809
> > 11 238892.: 2.74381 0.111146 0.666155 0.693544 0.662000 0.104060 0.207591
> > 12 238888.: 2.75052 0.0990238 0.689199 0.694588 0.655781
> >0.106516 0.216460
> > 13 238861.: 2.80325 0.126935 1.05912 0.733914 0.556162 0.159296 0.360938
> > 14 238832.: 2.82656 0.117617 1.59471 0.607916 0.441371
> >0.0749944 0.976142
> > 15 238811.: 2.87350 0.136332 1.59046 0.653141 0.353763 0.226061 1.79285
> > 16 238810.: 2.87663 0.125135 1.58992 0.656808 0.352605 0.220488 1.79282
> > 17 238806.: 2.89342 0.141551 1.58607 0.676523 0.344212 0.181833 1.79268
> > 18 238804.: 2.90080 0.125137 1.56624 0.682921 0.261295 0.180598 1.74325
> > 19 238802.: 2.91950 0.128569 1.56836 0.680436 0.336051 0.159940 1.80400
> > 20 238801.: 2.92795 0.134762 1.56597 0.685121 0.331695 0.145547 1.80414
> > 21 238801.: 2.93741 0.125667 1.56139 0.687827 0.332700 0.138854 1.81495
> > 22 238800.: 2.94588 0.131757 1.55294 0.687909 0.330414 0.137834 1.82826
> > 23 238799.: 2.96867 0.129716 1.52943 0.688678 0.323171 0.139912 1.84615
> > 24 238799.: 2.98994 0.133378 1.52188 0.700038 0.337387 0.131403 1.77623
> > 25 238799.: 3.00312 0.135308 1.51475 0.697550 0.311750 0.145683 1.78037
> > 26 238799.: 3.00461 0.129920 1.51083 0.697666 0.306722 0.138745 1.81188
> > 27 238799.: 3.00504 0.134882 1.50539 0.696745 0.302949 0.135897 1.84405
> > 28 238799.: 3.00422 0.134013 1.47947 0.698115 0.303243 0.133806 1.86486
> > 29 238799.: 3.00819 0.134378 1.48185 0.701871 0.307097 0.134637 1.84996
> > 30 238799.: 3.01313 0.134279 1.49098 0.702883 0.304788 0.133682 1.86254
> > 31 238799.: 3.01291 0.134253 1.49060 0.701818 0.303155 0.133771 1.84613
> > 32 238799.: 3.01142 0.134314 1.48921 0.701782 0.303589 0.134439 1.84653
> > 33 238799.: 3.01174 0.134315 1.48926 0.701641 0.304120 0.134145 1.84635
> > 34 238799.: 3.01175 0.134304 1.48942 0.701740 0.303762 0.134185 1.84649
> > 35 238799.: 3.01173 0.134307 1.48937 0.701724 0.303809 0.134206 1.84647
> >[1] 43.10 3.78 48.41 0.00 0.00
> >
> >
> >(If you run the timing yourself and don't want to see the iteration
> >output, take the msV=1 out of the control list. I keep it in there so
> >I can monitor the progress.)
> >
> >If I time the same model fit on a dual-core Athlon 64 X2 3800+ with
> >the same version of R, BLAS and Matrix package, the timing ends up
> >with something like
> >
> >90 140 235 0 0
> >
> >I do see that the multithreading is working on a calculation that is
> >essentially BLAS-bound such as
> >
> >
> >
> >>mm <- Matrix(rnorm(1e6), nc = 1e3)
> >>system.time(crossprod(mm))
> >>
> >>
> >[1] 0.57 0.02 0.60 0.00 0.00
> >
> >On the X2 processor it still takes about 0.6 seconds user time but
> >only 0.3 seconds elapsed time when the machine is otherwise idle and
> >both cores are available for the calculation.
> >
> >Any suggestions why the dual-core processor is so much slower than the
> >single core processor?
> >
> >By the way, I would be interested in accumulating timings of this
> >model fit on other systems. If you do time it please send me
> >(off-list) a summary of the version of R, version of the accelerated
> >BLAS if you use them, processor speed and configuration (i.e.
> >multiprocessor, multicore, etc.) and, if you know it, memory speed.
> >
> >This is an example of a complex multilevel model with crossed grouping
> >factors fit to a relatively large (30000 observations on 10000
> >students, 1400 teachers and 80 schools) data set.
> >
> >______________________________________________
> >R-devel at r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
>
More information about the R-devel
mailing list