R-alpha: speed of sort(.) and order(.)
Martin Maechler
Martin Maechler <maechler@stat.math.ethz.ch>
Wed, 3 Sep 1997 18:13:16 +0200
sort() and order() are not quite the same, as "one knows":
o order allows breaking ties by more than one argument;
o sort allows a 'partial' and 'na.last' argument
Still, the following timing (on a `simple' UltraSparc I)
suggest that actually two different algorithms are used
> N <- 10000
> typeof(x0 <- 1:N) # --- x0 : already sorted ---
[1] "integer"
> typeof(x <- sample(N))
[1] "integer"
> CPU <- numeric(10)
> for(j in 1:10) CPU[j] <- unix.time(for(i in 1:20) sort(x0))[1]
> summary(CPU)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.070 1.080 1.080 1.137 1.217 1.240
> for(j in 1:10) CPU[j] <- unix.time(for(i in 1:20) order(x0))[1]
> summary(CPU)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.900 1.000 1.035 1.012 1.047 1.070
'order' is slightly faster for an already sorted vector
(this seems consistent for further examples,
not statistically significant for these 10..)
> for(j in 1:10) CPU[j] <- unix.time(for(i in 1:20) sort(x))[1]
> summary(CPU)
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.700 1.720 1.735 1.765 1.827 1.870
> for(j in 1:10) CPU[j] <- unix.time(for(i in 1:20) order(x))[1]
> summary(CPU)
Min. 1st Qu. Median Mean 3rd Qu. Max.
3.000 3.100 3.140 3.181 3.227 3.540
>
which makes 'sort' almost a factor of 2 FASTER for an unsorted 'x'.
Any comments from the coders (R & R) ?
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-devel-request@stat.math.ethz.ch
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-