[R] performance gap between R 1.7.1 and 1.8.0

Liaw, Andy andy_liaw at merck.com
Sat Nov 29 04:19:27 CET 2003


Dear R-help,

A colleague of mine was running some code on two of our boxes, and noticed a
rather large difference in running time.  We've so far isolated the problem
to the difference between R 1.7.1 and 1.8.0, but not more than that.  The
exact same code took 933.5 seconds in 1.7.1, and 3594.4 seconds in 1.8.1, on
the same box.

Basically, the code calls boot() to bootstrap fitting mixture models by
calling flexmix() (in the flexmix package) with intercept-only models.
The code needs to be stripped down further, but I thought some of you might be
able to tell what's wrong from the info we have so far:

I ran R profiling on the code, under both R 1.7.1 and 1.8.1 (but the
performance gap is there in 1.8.0 already).  Sorted by "self" part of the
output, the top 10 lines are:

R 1.7.1:
   %       self        %       total
 self     seconds    total    seconds    name
 10.75    100.36     10.75    100.36     ".Fortran"
  8.71     81.32     34.27    319.88     "lm.wfit"
  5.43     50.66     81.19    757.88     "FLXfit"
  4.30     40.16      4.30     40.16     "^"
  4.26     39.80      4.26     39.80     "=="
  4.15     38.74      4.99     46.62     "names"
  3.90     36.38     20.57    191.98     "initialize"
  3.51     32.80      5.37     50.14     "dnorm"
  2.29     21.34      4.94     46.14     "hclass"
  1.83     17.10      5.85     54.64     "inherits"

R 1.8.1:
   %       self        %       total
 self     seconds    total    seconds    name
  6.24    224.26     11.69    420.32     "paste"
  5.93    213.04     13.21    474.76     "read.dcf"
  4.37    157.24      5.17    185.92     "names"
  4.18    150.42      5.53    198.66     "exists"
  3.71    133.52     14.66    527.00     "lapply"
  3.15    113.16      4.43    159.32     "names<-"
  2.98    107.26      2.98    107.26     ".Fortran"
  2.57     92.46      5.75    206.82     "seq"
  2.54     91.42      3.18    114.14     "seq.default"
  2.37     85.24      8.93    320.96     "lm.wfit"
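
For anyone who wants to collect a comparable profile, here is a minimal
sketch. The work() function is a hypothetical stand-in, not our actual
bootstrap code; Rprof() and summaryRprof() are the standard profiling calls,
and the column layout of the summary may differ between R versions.

```r
# Hypothetical stand-in for the real code: repeated weighted least-squares
# fits for about one second, so the profiler has something to sample.
work <- function(seconds = 1) {
  x <- cbind(1, rnorm(500))
  y <- rnorm(500)
  w <- runif(500)
  t0 <- proc.time()[["elapsed"]]
  while (proc.time()[["elapsed"]] - t0 < seconds) {
    lm.wfit(x, y, w)
  }
  invisible(NULL)
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)  # start the sampling profiler
work()
Rprof(NULL)                        # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self, 10)             # top 10 rows of the "self" table
```

Running the same script under each R version and comparing the two "self"
tables is how the listings above were compared.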

The ".Fortran" call took about the same amount of time in both versions, as
did "lm.wfit", so that's a bit comforting.  (The code also fits the same model
using mclust, and that's probably where the .Fortran call comes from.)  Two
things puzzle me:

-  Several functions took much longer in 1.8.1; e.g. (self seconds):
             1.7.1  1.8.1
  "paste"       28    224
  "names"       38    157
  "names<-"     15    113
  "exists"       4    150

-  The presence of "read.dcf" in 1.8.1.  Where could this be from?
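
To put numbers on the slowdown, the ratios can be worked out directly from
the figures quoted in this message. This is only arithmetic on the values
already given; the variable names are mine.

```r
# Self seconds copied from the small table above (1.7.1 vs 1.8.1).
self_sec <- data.frame(
  r171 = c(28, 38, 15, 4),
  r181 = c(224, 157, 113, 150),
  row.names = c("paste", "names", "names<-", "exists")
)
self_sec$ratio <- round(self_sec$r181 / self_sec$r171, 1)
print(self_sec)   # ratios: 8.0, 4.1, 7.5, 37.5

# Overall slowdown from the total running times quoted at the top.
overall <- 3594.4 / 933.5
round(overall, 2) # 3.85
```

So "exists" slowed down far more than the code as a whole, which got roughly
3.85 times slower.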

Any clues as to why we're seeing this?

Best,
Andy

Andy Liaw, PhD
Biometrics Research      PO Box 2000, RY33-300     
Merck Research Labs           Rahway, NJ 07065
mailto:andy_liaw at merck.com        732-594-0820



