R-alpha: Sorting Efficiency

Ross Ihaka ihaka@stat.auckland.ac.nz
Fri, 5 Sep 1997 08:39:05 +1200 (NZST)


As part of checking Martin Maechler's observations of sorting
efficiency I have been putting the code under the microscope.  Part of
this involved comparing the performance with Splus. I turns out that
the R code is much slower.  A good deal of this is due to the
interpreted front end to the R verion of sort.  It is worth taking
a look at the sort function and seeing if you can see what the
problem is.  (hint: turn on GC reporting with "gcinfo").

A good deal of the rest of the inefficiency is because the internal R
code attempts to deal with missing observations.  This is no longer
needed because this is dealt with in the interpreted front end.

An overall fix of the code should see us nearly match the performance
of Splus.  I say nearly because I think I will use heapsort instead of
quicksort because I want to avoid worst-case performance rather than
seek good average performance.

I don't yet have an explanation of the behavior which Martin uncovered.
	Ross
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-