R-alpha: Sorting Efficiency -- thoughts.. : if(!is.sorted(x)) x <- sort(x)

Martin Maechler Martin Maechler <maechler@stat.math.ethz.ch>
Fri, 5 Sep 1997 10:32:32 +0200

Since you are delving into the code anyway....
Here is a thought that I have been having for a long time:

Wouldn't it make sense / be useful  to have a function 	is.sorted
which returns TRUE   if  all(diff(x) >= 0).	        ---------

I actually have been using the function

	is.sorted <- function(x) (length(x) <= 1) || all(diff(x) >= 0)

in my collection of S-plus enhancements for quite while now.

I can do things as

	if(!is.sorted(x)) {
		o <- order(x)
		x <- x[o]
		y <- y[o]
which potentially saves computing time (over unconditional 'order'),
1.	        is.sorted  is  O(n)
    whereas	sort       is  O(n*log(n)) 
2. for sorted x, the two assignments are superfluous even when order(x) was
   very fast (for SORTED x). 

TO be CPU-efficient, is.sorted  probably should be written .Internal(.).

On the other hand,  sort(.) and order(.)  could be designed such that they
use a sorting algorithm which guarantees to be fast when things are already
sorted. [maybe this is already the case,
	 and one of your reasons against quicksort? ]
This would make the 'if(..)' in  
	if(!is.sorted(x)) x <- sort(x)
superfluous, but the situation with order(.) above
(which is NOT infrequent in statistics !!) would still profit from not
doing  'x <- x[1:length(x)]' when x is sorted.


r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch