[Rd] [R] custom sort?
Duncan Murdoch
murdoch at stats.uwo.ca
Fri May 29 19:02:48 CEST 2009
On 5/29/2009 9:28 AM, Duncan Murdoch wrote:
> I've moved this to R-devel...
>
> On 5/28/2009 8:17 PM, Stavros Macrakis wrote:
>> I couldn't get your suggested method to work:
>>
>> `==.foo` <- function(a,b) unclass(a)==unclass(b)
>> `>.foo` <- function(a,b) unclass(a) < unclass(b) # invert comparison
>> is.na.foo <- function(a)is.na(unclass(a))
>>
>> sort(structure(sample(5),class="foo")) #-> 1:5 -- not reversed
>>
>> What am I missing?
>
> There are two problems. First, I didn't mention that you need a method
> for indexing as well. The code needs to evaluate things like x[i] >
> x[j], and by default x[i] will not be of class "foo", so the custom
> comparison methods won't be called.
>
> Second, I think there's a bug in the internal code, specifically in
> do_rank or orderVector1 in sort.c: orderVector1 ignores the class of x.
> do_rank pays attention when breaking ties, so I think this is an
> oversight.
>
> So I'd say two things should be done:
>
> 1. the bug should be fixed. Even if this isn't the most obvious
> approach, it should work.

I've now fixed the bug, and clarified the documentation to say:

    The default method will make use of == and > methods
    for the class of x[i] (for integers i), and the
    is.na method for the class of x, but might be rather
    slow when doing so.

You don't actually need a custom indexing method; you just need to be
aware that it's the class of x[i] that is important for comparisons.

This will make it into R-patched and R-devel.

Duncan Murdoch

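
To make the point about the class of x[i] concrete, here is a minimal sketch. It reuses the "foo" comparison methods from the quoted example. The `[.foo` method is a hypothetical addition for illustration only, not part of the fix. The sketch only shows which comparison gets called on x[i]; whether sort() ends up using these methods depends on the R version and on the underlying data type, so that is not shown here.

```r
# Comparison methods from the quoted example, with `>` deliberately inverted.
`==.foo` <- function(a, b) unclass(a) == unclass(b)
`>.foo`  <- function(a, b) unclass(a) < unclass(b)

x <- structure(c(3L, 1L, 2L), class = "foo")

# The default `[` drops the class attribute, so the elements are plain
# integers and the comparison uses the built-in `>`.
plain_class  <- class(x[1])
plain_result <- x[1] > x[2]      # built-in comparison: 3 > 1

# Hypothetical indexing method that keeps the class on the result.
`[.foo` <- function(x, i) structure(unclass(x)[i], class = "foo")

kept_class  <- class(x[1])
kept_result <- x[1] > x[2]       # dispatches to `>.foo`, which tests 3 < 1
```

With the default indexing the elements compare as ordinary integers; once x[i] keeps class "foo", the same expression goes through the inverted method.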
>
> 2. we should look for ways to make all of this simpler, e.g. allowing
> a comparison function to be used.
>
> I'll take on 1, but not 2. It's hard to work out the right place for
> the comparison function to appear, and it would require a lot of work to
> implement, because all of this stuff (sort, rank, order, xtfrm,
> sort.int, etc.) is closely interrelated, some but not all of the
> functions are S3 generics, some implemented internally, etc. In the
> end, I'd guess the results won't be very satisfactory from a performance
> point of view: all those calls out to R to do the comparisons are going
> to be really slow.
>
> I think your advice to use order() with multiple keys is likely to be
> much faster in most instances. It's just a better approach in R.
>
> Duncan Murdoch
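
As a rough illustration of the order()-with-multiple-keys approach mentioned above, the sketch below sorts by one key ascending and a second key descending. The vectors `name`, `group` and `score` are made-up data for the example, and negating a numeric key is just one way to reverse its direction.

```r
# Illustrative data; the names and values are made up for this sketch.
name  <- c("a", "b", "c", "d")
group <- c(2, 1, 2, 1)
score <- c(10, 30, 20, 40)

# Primary key: group ascending. Secondary key: score descending,
# obtained by negating the numeric key.
idx <- order(group, -score)
sorted_names <- name[idx]
```

All comparisons here happen inside order() on plain numeric vectors, with no calls back into R code per comparison, which is why this route tends to be fast.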
>
>>
>> -s
>>
>> On Thu, May 28, 2009 at 5:48 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>>
>>> On 28/05/2009 5:34 PM, Steve Jaffe wrote:
>>>
>>>> Sounds simple but haven't been able to find it in docs: is it possible to
>>>> sort a vector using a user-defined comparison function? Seems it must be,
>>>> but "sort" doesn't seem to provide that option, nor does "order" sfaics
>>>>
>>>
>>> You put a class on the vector (e.g. using class(x) <- "myvector"), then
>>> define a conversion to numeric (e.g. xtfrm.myvector) or actual comparison
>>> methods (you'll need ==.myvector, >.myvector, and is.na.myvector).
>>>
>>> Duncan Murdoch
>>>
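
The "conversion to numeric" route suggested above might look like the following minimal sketch. The class name "myvector" comes from the message; the data and the choice of a descending order are assumptions made for the example.

```r
x <- structure(c(3L, 1L, 5L, 2L, 4L), class = "myvector")

# xtfrm method: return a numeric key that sorts in the desired order.
# Negating the values asks for a descending sort (an arbitrary choice here).
xtfrm.myvector <- function(x) -unclass(x)

idx <- order(x)                   # order() consults xtfrm() for classed objects
descending <- unclass(x)[idx]     # underlying values in the custom order
sorted <- sort(x)                 # sort() on a classed object goes through order()
```

Only one call to the xtfrm method is needed per sort, after which the ordering is done on an ordinary numeric vector.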
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>
>