[R] sorting without order
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Nov 23 13:58:30 CET 2004
On Tue, 23 Nov 2004, Peter Dalgaard wrote:
> "Marc Mamin" <M.Mamin at intershop.de> writes:
>> In order to increase the performance of a script I'd like to sort very large vectors containing repeated integer values.
>> I'm not interesting in having the values sorted, but only grouped.
>> I also need the equivalent of index.return from the standard "sort" function:
>> grouped: c(10,10,10,1,1,100)
>> ix: c(1,3,6,2,5,4)
>> is there a way to achieve this which would be faster than the standard sort function?
>> Thanks for any hints,
> Here's one way:
> v <- c(10,1,10,100,1,10)
> ix <- do.call("c",split(seq(along=v),v))
> grouped <- v[ix]
> Not sure about the speed though. Should be O(N) if the number of
> groups is small, but the multiplier could be large because of various
> formalities (such as adding names to ix).
Radix sorting as implemented in sort.list will be hard to beat:
ix <- sort.list(unclass(factor(v)), method="radix")
although in your case as.integer() will do.
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help