[R] Sorting a Data Frame by hybrid string / number key
William Dunlap
wdunlap at tibco.com
Thu Feb 3 17:22:27 CET 2011
To sort a character vector in a desired order,
you can convert it to a factor with the levels
in the desired order. To sort strings like "2"
and "11" in numerical order, convert them
to numbers with as.numeric. To sort by two variables,
using the second to break ties in the first,
use data[order(first, second),]. E.g.,
> library(stringr)
> d <- data.frame(instance =
+   c("competition11","competition01","big_20","small_4","small_2","med_9"))
> mySortByInstance <- function(data) {
+    # assume data$instance is of form <type><number>, perhaps
+    # with underscore between. Sort by type, breaking ties
+    # with number.
+    id <- as.numeric(str_extract(data$instance, "\\d{1,}$"))
+    type <- factor(str_extract(data$instance, "^[[:alpha:]]+"),
+                   levels=c("competition", "small", "med", "big"))
+    data[order(type, id), , drop=FALSE]
+ }
> mySortByInstance(d)
       instance
2 competition01
1 competition11
5       small_2
4       small_4
6         med_9
3        big_20
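
If you would rather not load stringr, the same three steps (a factor for the
custom category order, as.numeric for the digits, order() with two keys) can
be sketched in base R. The names sortByInstanceBase and typeLevels are made up
for illustration, and the patterns assume every instance is a run of leading
letters followed by a run of trailing digits, perhaps with an underscore
between.

```r
# Hypothetical base-R variant of mySortByInstance; a sketch, not a tested library function.
sortByInstanceBase <- function(data,
                               typeLevels = c("competition", "small", "med", "big")) {
  instance <- as.character(data$instance)
  # Drop everything up to the last non-digit, leaving the trailing digits;
  # as.numeric makes "2" sort before "11".
  id <- as.numeric(sub("^.*[^0-9]", "", instance))
  # Drop everything from the first non-letter on, leaving the leading letters;
  # the order of typeLevels defines the sort order of the categories.
  type <- factor(sub("[^[:alpha:]].*$", "", instance), levels = typeLevels)
  # Sort by type, using id to break ties within a type.
  data[order(type, id), , drop = FALSE]
}

d <- data.frame(instance =
  c("competition11", "competition01", "big_20", "small_4", "small_2", "med_9"))
sorted <- sortByInstanceBase(d)
```

With the data above, sorted should have its rows in the same order as the
output of mySortByInstance(d). A type that is not listed in typeLevels becomes
NA in the factor, and order() puts NA keys last by default, so such rows end
up at the bottom.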
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Alastair
> Sent: Thursday, February 03, 2011 7:13 AM
> To: r-help at r-project.org
> Subject: [R] Sorting a Data Frame by hybrid string / number key
>
>
> Hi,
>
> I'm trying to present a table of some experimental data, and I want to
> order the rows by the instance names. The issue I've got is that there
> are a variety of conventions for the instance names (e.g. competition01,
> competition13, small_1, big_20, med_9). What I want to be able to sort
> them first in category order so: competition < small < med < big, and
> then perform the secondary ordering by the final one or two digits.
>
> I've used Hadley Wickham's StringR package to split the names into the
> string and numeric sections so I could get those ordered easily enough.
> What I'm struggling with is how to sort the categories (because I don't
> want them in a straight alphabetic order).
>
> library(stringr)
> d <- data.frame(instance =
>   c("competition11","competition01","big_20","small_4","small_2","med_9"))
> id <- str_extract(d$instance, "\\d{1,}$")
>
> Any pointers would be gratefully received.
> Thanks,
> Alastair
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Sorting-a-Data-Frame-by-hybrid-string-number-key-tp3258283p3258283.html
> Sent from the R help mailing list archive at Nabble.com.