[R] Sorting based on a custom sorting function

Duncan Murdoch
Thu Dec 14 12:02:09 CET 2023

On 14/12/2023 3:00 a.m., Martin Møller Skarbiniks Pedersen wrote:
> Hi,
>    I need to sort a data.frame based on a custom sorting function.
>    It is easy in many languages but I can't find a way to do it in R.
>    In many cases I could just use an ordered factor but my data.frame
> contains poker hands and
> I need to rank these hands. I already have a function that compares two hands.
> Here is an MRE (Minimal, Reproducible Example):
> df <- data.frame(person = c("Alice", "Bob", "Charlie"), value =
> c("Medium", "Small", "Large"))
> # 0 means equal, -1 means left before right, 1 means right before left
> custom_sort <- function(left, right) {
>    if (left == right) return(0)
>    if (left == "Small") return(-1)
>    if (left == "Medium" && right == "Large") return(-1)
>    return(1)
> }
> # sort df according to custom_sort
> # expected output is a data.frame:
> #    person  value
> # 1     Bob  Small
> # 2   Alice Medium
> # 3 Charlie  Large
> In this simple case I can just use an ordered factor but what about
> the poker hands situation?

The general way in base R is to put the objects in a vector (which might 
be a list if they are complex objects), assign a class to that vector, 
and define either an xtfrm method or methods for ==, >, is.na, and 
extraction for that vector.  The xtfrm method is basically
the same as using an ordered factor, so I'll skip that, and show you the 
other way:

For your example, you could do it like this:

class(df$value) <- "sizeclass"

`>.sizeclass` <- function(left, right)
  custom_sort(unclass(left), unclass(right)) == 1

`==.sizeclass` <- function(left, right)
  custom_sort(unclass(left), unclass(right)) == 0

`[.sizeclass` <- function(x, i) structure(unclass(x)[i], class="sizeclass")
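
With those methods in place, order() should pick them up: for a classed 
vector it goes through xtfrm() and rank(), which compare elements pairwise 
using ==, > and [. Here is a self-contained sketch that repeats the 
definitions above and then sorts the data frame. The names "ord" and 
"sorted" are only illustrative, and the methods are assumed to be defined 
at top level (the global environment) so that dispatch can find them.

```r
# Data and comparison function from the original question.
df <- data.frame(person = c("Alice", "Bob", "Charlie"),
                 value = c("Medium", "Small", "Large"))

# 0 means equal, -1 means left before right, 1 means right before left
custom_sort <- function(left, right) {
  if (left == right) return(0)
  if (left == "Small") return(-1)
  # scalar test, so && is enough here
  if (left == "Medium" && right == "Large") return(-1)
  return(1)
}

class(df$value) <- "sizeclass"

`>.sizeclass` <- function(left, right)
  custom_sort(unclass(left), unclass(right)) == 1

`==.sizeclass` <- function(left, right)
  custom_sort(unclass(left), unclass(right)) == 0

`[.sizeclass` <- function(x, i) structure(unclass(x)[i], class = "sizeclass")

# order() ranks the elements using the methods defined above.
ord <- order(df$value)   # 2 1 3: Small, Medium, Large
sorted <- df[ord, ]
```

Note that the comparison function is called on one pair of elements at a 
time, so it only needs to handle length-one arguments.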


All the "unclass()" calls are needed to avoid infinite recursion.  For a 
more complex kind of object where you are extracting attributes to 
compare, you probably wouldn't need so many of those.
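
For comparison, the xtfrm route mentioned at the top might look like the 
sketch below. It assumes each value can be mapped to a number that sorts 
the same way; "size_levels" is a hypothetical lookup vector made up for 
this example. For poker hands the method would have to compute a numeric 
score per hand instead, which is only possible if such a score exists.

```r
df <- data.frame(person = c("Alice", "Bob", "Charlie"),
                 value = c("Medium", "Small", "Large"))

class(df$value) <- "sizeclass"

# Hypothetical ranking table: the position in this vector is the sort key.
size_levels <- c("Small", "Medium", "Large")

# xtfrm() must return a numeric vector that sorts in the same order as x.
xtfrm.sizeclass <- function(x) match(unclass(x), size_levels)

# order() calls xtfrm() on classed vectors, so no ==, > or [ methods
# are needed with this approach.
sorted <- df[order(df$value), ]
```

This is shorter, but it does not use the pairwise comparison function at 
all, which is why it is much the same as an ordered factor.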

There are likely other ways to do this in particular packages such as 
dplyr or data.table.

Duncan Murdoch
