[Rd] Wanted: sort.data.frame
Kevin Wright
kwright at eskimo.com
Wed Jul 21 18:25:54 CEST 2004
I've often wanted a function for sorting a data frame by multiple columns.
I know it's not too hard to do using the order function, but given the
frequency of questions about this on R-help, it would seem to me the task
could be simplified.
d = data.frame(x=c("A","D","A","C"),y=c(8,3,9,9),z=c(1,1,1,1))
My first attempt was something like
sort.data.frame(d, by=c(x,y))
which immediately failed since x is not an object.
I then decided something like this would be better
sort.data.frame(d, ~ x -y +z)
where + indicates ascending and - indicates descending.
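In terms of order, that call would amount to something like the following. This is only a sketch of the intended result, not a proposed implementation, and negating y works here only because y is numeric.

```r
d <- data.frame(x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), z = c(1, 1, 1, 1))

# Ascending x, descending y, ascending z.
# The minus sign on y relies on y being numeric.
sorted <- d[order(d$x, -d$y, d$z), ]
print(sorted)
```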
This ordering of the arguments seems natural to me, but in order to
be consistent with other functions that have formulae it would
probably be better to use sort.data.frame(formula, data).
I spent an hour or so working on this and then admitted that manipulating
R formulas is not one of my stronger skills. (I would have used a loop, and
then used substitute and eval(parse(text=...)), and would be embarrassed
for anyone to see it.)
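For the record, here is a rough sketch of the direction I have in mind, walking the formula as a call tree rather than pasting strings together. The name sort_by_formula and the helper collect are made up for illustration, it takes the data frame first, and it handles only bare column names joined by + and -. It uses xtfrm so that a minus sign also works on factors and character columns.

```r
# Hypothetical helper, not an existing R function.
sort_by_formula <- function(data, formula) {
  # Recursively collect (column name, sign) pairs from the formula.
  collect <- function(expr, sign = 1) {
    if (is.name(expr)) {
      return(list(list(name = as.character(expr), sign = sign)))
    }
    if (is.call(expr) && is.name(expr[[1]])) {
      op <- as.character(expr[[1]])
      flip <- if (op == "-") -sign else sign
      if (op %in% c("+", "-") && length(expr) == 2) {
        # Unary plus or minus, as in ~ -x
        return(collect(expr[[2]], flip))
      }
      if (op %in% c("+", "-") && length(expr) == 3) {
        # Binary plus or minus: the sign applies to the right-hand term
        return(c(collect(expr[[2]], sign), collect(expr[[3]], flip)))
      }
    }
    stop("unsupported term in formula: ", deparse(expr))
  }

  # For a one-sided formula the last element is the right-hand side.
  spec <- collect(formula[[length(formula)]])

  # Turn each column into a numeric sort key; a negative sign means descending.
  keys <- lapply(spec, function(term) term$sign * xtfrm(data[[term$name]]))

  data[do.call(order, keys), , drop = FALSE]
}

d <- data.frame(x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9), z = c(1, 1, 1, 1))
res <- sort_by_formula(d, ~ x - y + z)
print(res)
```

Terms other than plain column names (interactions, function calls, parentheses) stop with an error in this sketch; a real version would need to decide what to do with them.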
My personal feeling is that this function would be quite helpful and
reduce the frequency of sort questions on R-help.
This might be a nice, modest programming challenge for R gurus.
All contributors with such a function would have my respect.
Kevin Wright