[R] Ordering long vectors

Göran Broström gb at stat.umu.se
Mon Jun 9 18:37:07 CEST 2003


On Mon, 9 Jun 2003, Prof Brian Ripley wrote:

> On Sun, 8 Jun 2003, Göran Broström wrote:
> 
> > On Sun, 8 Jun 2003, Thomas Lumley wrote:
> > 
> > > On Sat, 7 Jun 2003, [ISO-8859-1] Göran Broström wrote:
> > > 
> > > >
> > > > I need to order a long vector of integers with rather few unique values.
> > > > This is very slow:
> > > 
> > > 
> > > I think the culprit is
> > > 
> > > src/main/sort.c: orderVector1
> > > 
> > >     /* Shell sort isn't stable, but it proves to be somewhat faster
> > >        to run a final insertion sort to re-order runs of ties when
> > >        comparison is cheap.
> > >     */
> > > 
> > > This also explains:
> > > 
> > > > aa<-sample(rep(1:10,50000))
> > > > system.time( order(aa, 1:length(aa)))
> > > [1] 3.67 0.01 3.68 0.00 0.00
> > > > system.time( order(aa))
> > > ^C
> > > Timing stopped at: 49.33 0.01 49.34 0 0
> > > 
> > > which is perhaps the simplest work-around :).
> > 
> 
> There is a simple and very much faster solution if you don't care about 
> ordering of ties:
> 
> system.time(ord4 <- sort(x, method="quick", index.return = TRUE)$ix)

Thanks, it is indeed much faster; about 7 times on my example. 

> That is in the help, and I am very surprised you failed to find it.

Well, I found it, but I was cooled off by the "somewhat faster than 
Shellsort" and "poor performance in the worst case", together with the 
fact that I only needed the order and not the sorted vector. Maybe a more 
positive description is in order :-). Especially on the help page of 'order'.

G.




More information about the R-help mailing list