[R] Ordering long vectors
Göran Broström
gb at stat.umu.se
Sat Jun 7 18:29:20 CEST 2003
I need to order a long vector of integers with rather few unique values.
This is very slow:
> x <- sample(rep(c(1:10), 50000))
> system.time(ord <- order(x))
[1] 189.18 0.09 190.48 0.00 0.00
But with no ties
> y <- sample(500000)
> system.time(ord1 <- order(y))
[1] 1.18 0.00 1.18 0.00 0.00
it is very fast!
This gave me the following idea: since I don't care about the order
within tied values, why not break the ties by adding a small random
disturbance to x? And indeed,
> unix.time(ord2 <- order(x + runif(length(x), -0.1, 0.1)))
[1] 1.66 0.00 1.66 0.00 0.00
> identical(x[ord], x[ord2])
[1] TRUE
it works!
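Why the disturbance is safe can be checked on a smaller example. This is only a sketch: the seed, the vector length, and the names `ord_jit`, `is_perm` and `is_sorted` are illustrative assumptions, not part of the timings above. Distinct integers differ by at least 1 and each disturbance lies within 0.1 of zero, so two different values can never swap places; only the order within ties is randomized.

```r
set.seed(42)                          # illustrative seed, for reproducibility only
x <- sample(rep(1:10, 500))           # 5000 integers, only 10 unique values

# Each disturbance is at most 0.1 in absolute value, well under half the gap
# between neighbouring integers, so distinct values keep their relative order.
ord_jit <- order(x + runif(length(x), -0.1, 0.1))

is_perm   <- identical(sort(ord_jit), seq_along(x))   # a valid permutation?
is_sorted <- !is.unsorted(x[ord_jit])                 # does it sort x?
```

If x were not integer-valued, the width of the disturbance would have to be chosen smaller than half the smallest gap between distinct values.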
Is there an obvious (=better) solution to this problem that I have
overlooked? In any case, I think that the problem with order and many
ties is worth mentioning in the help page.
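One deterministic alternative, sketched here under assumptions rather than offered as the definitive answer: when the values are integers with few distinct levels and no NAs, the positions can be grouped by value and the groups concatenated, so tied elements are never compared with each other. The helper name `order_by_group` is hypothetical, and no timing is claimed for it.

```r
# Hypothetical helper: an ordering permutation for a vector with few unique
# values, built by grouping positions by value (assumes no NAs in x).
order_by_group <- function(x) {
  # split() groups the positions 1..n by the value of x, with the groups in
  # increasing order of that value; unlist() concatenates the groups.
  unlist(split(seq_along(x), x), use.names = FALSE)
}

set.seed(1)                           # illustrative seed
x <- sample(rep(1:10, 500))           # small stand-in for the long vector
ord_grp <- order_by_group(x)
same_result <- identical(x[ord_grp], sort(x))
```

Within each value the original positions stay in increasing order, so unlike the random disturbance this keeps ties in their original order.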
For the record: R 1.7.0 on Red Hat Linux 9 (RH9).
Göran
---
Göran Broström               tel: +46 90 786 5223
Department of Statistics     fax: +46 90 786 6614
Umeå University              http://www.stat.umu.se/egna/gb/
SE-90187 Umeå, Sweden        e-mail: gb at stat.umu.se