[R] Ordering long vectors

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jun 9 15:47:18 CEST 2003


On Sun, 8 Jun 2003, Göran Broström wrote:

> On Sun, 8 Jun 2003, Thomas Lumley wrote:
> 
> > On Sat, 7 Jun 2003, [ISO-8859-1] Göran Broström wrote:
> > 
> > >
> > > I need to order a long vector of integers with rather few unique values.
> > > This is very slow:
> > 
> > 
> > I think the culprit is
> > 
> > src/main/sort.c: orderVector1
> > 
> >     /* Shell sort isn't stable, but it proves to be somewhat faster
> >        to run a final insertion sort to re-order runs of ties when
> >        comparison is cheap.
> >     */
> > 
> > This also explains:
> > 
> > > aa<-sample(rep(1:10,50000))
> > > system.time( order(aa, 1:length(aa)))
> > [1] 3.67 0.01 3.68 0.00 0.00
> > > system.time( order(aa))
> > ^C
> > Timing stopped at: 49.33 0.01 49.34 0 0
> > 
> > which is perhaps the simplest work-around :).
> 

There is a simple and very much faster solution if you don't care about 
ordering of ties:

system.time(ord4 <- sort(x, method="quick", index.return = TRUE)$ix)

That is in the help, and I am very surprised you failed to find it.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595




More information about the R-help mailing list