[R] Ordering long vectors
Göran Broström
gb at stat.umu.se
Mon Jun 9 18:37:07 CEST 2003
On Mon, 9 Jun 2003, Prof Brian Ripley wrote:
> On Sun, 8 Jun 2003, Göran Broström wrote:
>
> > On Sun, 8 Jun 2003, Thomas Lumley wrote:
> >
> > > On Sat, 7 Jun 2003, [ISO-8859-1] Göran Broström wrote:
> > >
> > > >
> > > > I need to order a long vector of integers with rather few unique values.
> > > > This is very slow:
> > >
> > >
> > > I think the culprit is
> > >
> > > src/main/sort.c: orderVector1
> > >
> > > /* Shell sort isn't stable, but it proves to be somewhat faster
> > > to run a final insertion sort to re-order runs of ties when
> > > comparison is cheap.
> > > */
> > >
> > > This also explains:
> > >
> > > > aa<-sample(rep(1:10,50000))
> > > > system.time( order(aa, 1:length(aa)))
> > > [1] 3.67 0.01 3.68 0.00 0.00
> > > > system.time( order(aa))
> > > ^C
> > > Timing stopped at: 49.33 0.01 49.34 0 0
> > >
> > > which is perhaps the simplest work-around :).
> >
>
> There is a simple and very much faster solution if you don't care about
> ordering of ties:
>
> system.time(ord4 <- sort(x, method="quick", index.return = TRUE)$ix)
Thanks, it is indeed much faster; about 7 times on my example.
> That is in the help, and I am very surprised you failed to find it.
Well, I found it, but I was cooled off by the "somewhat faster than
Shellsort" and "poor performance in the worst case", together with the
fact that I only needed the order and not the sorted vector. Maybe a more
positive description is in order :-). Especially on the help page of 'order'.
G.
More information about the R-help
mailing list