[Rd] [R] custom sort?
Duncan Murdoch
murdoch at stats.uwo.ca
Thu Jun 4 22:58:23 CEST 2009
Stavros Macrakis wrote:
> Thanks for the quick fix!
>
It was quick in R-devel, not so quick in R-patched. I forgot to commit
the change, but someone accidentally ported my NEWS item about it over,
so I thought I really had done it when I looked from another computer
the next day. Then I headed out on the road...
Will commit it to R-patched (from the Vancouver airport) in a few minutes.
Duncan Murdoch
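
For reference, the xtfrm route mentioned further down in the quoted thread
can be sketched roughly as follows. This is only an illustration, not the
code under discussion: the class name "revnum" and the sample values are
made up, and the method simply negates the underlying numbers so that
sort() returns them in descending order.

```r
# Hypothetical class "revnum": integers that should sort in reverse order.
x <- structure(c(3L, 1L, 5L, 2L, 4L), class = "revnum")

# xtfrm method: map each element to a number whose ordinary ascending
# order is the order we want (here, the negated value).
xtfrm.revnum <- function(x) -as.numeric(unclass(x))

# sort() on a classed object goes through order(), which calls xtfrm().
sorted <- sort(x)

# as.integer() strips any attributes, leaving just the sorted values.
print(as.integer(sorted))
```

Since only one vectorised conversion is needed, this should avoid the
per-comparison calls back into R that the comparison-method route requires.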
> -s
>
> On Fri, May 29, 2009 at 1:02 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>
>
>> On 5/29/2009 9:28 AM, Duncan Murdoch wrote:
>>
>>
>>> I've moved this to R-devel...
>>>
>>> On 5/28/2009 8:17 PM, Stavros Macrakis wrote:
>>>
>>>
>>>> I couldn't get your suggested method to work:
>>>>
>>>> `==.foo` <- function(a,b) unclass(a)==unclass(b)
>>>> `>.foo` <- function(a,b) unclass(a) < unclass(b) # invert comparison
>>>> is.na.foo <- function(a)is.na(unclass(a))
>>>>
>>>> sort(structure(sample(5),class="foo")) #-> 1:5 -- not reversed
>>>>
>>>> What am I missing?
>>>>
>>>>
>>> There are two problems. First, I didn't mention that you need a method
>>> for indexing as well. The code needs to evaluate things like x[i] > x[j],
>>> and by default x[i] will not be of class "foo", so the custom comparison
>>> methods won't be called.
>>>
>>> Second, I think there's a bug in the internal code, specifically in
>>> do_rank or orderVector1 in sort.c: orderVector1 ignores the class of x.
>>> do_rank pays attention when breaking ties, so I think this is an oversight.
>>>
>>> So I'd say two things should be done:
>>>
>>> 1. the bug should be fixed. Even if this isn't the most obvious
>>> approach, it should work.
>>>
>>>
>> I've now fixed the bug, and clarified the documentation to say
>>
>> The default method will make use of == and > methods
>> for the class of x[i] (for integers i), and the
>> is.na method for the class of x, but might be rather
>> slow when doing so.
>>
>> You don't actually need a custom indexing method, you just need to be aware
>> that it's the class of x[i] that is important for comparisons.
>>
>> This will make it into R-patched and R-devel.
>>
>> Duncan Murdoch
>>
>>
>>
>>
>>> 2. we should look for ways to make all of this simpler, e.g. allowing a
>>> comparison function to be used.
>>>
>>> I'll take on 1, but not 2. It's hard to work out the right place for the
>>> comparison function to appear, and it would require a lot of work to
>>> implement, because all of this stuff (sort, rank, order, xtfrm, sort.int,
>>> etc.) is closely interrelated, some but not all of the functions are S3
>>> generics, some implemented internally, etc. In the end, I'd guess the
>>> results won't be very satisfactory from a performance point of view: all
>>> those calls out to R to do the comparisons are going to be really slow.
>>>
>>> I think your advice to use order() with multiple keys is likely to be much
>>> faster in most instances. It's just a better approach in R.
>>>
>>> Duncan Murdoch
>>>
>>>
>>>
>>>> -s
>>>>
>>>> On Thu, May 28, 2009 at 5:48 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>>>>
>>>>> On 28/05/2009 5:34 PM, Steve Jaffe wrote:
>>>>>
>>>>>> Sounds simple but haven't been able to find it in docs: is it possible to
>>>>>> sort a vector using a user-defined comparison function? Seems it must be,
>>>>>> but "sort" doesn't seem to provide that option, nor does "order" sfaics
>>>>>>
>>>>> You put a class on the vector (e.g. using class(x) <- "myvector"), then
>>>>> define a conversion to numeric (e.g. xtfrm.myvector) or actual comparison
>>>>> methods (you'll need ==.myvector, >.myvector, and is.na.myvector).
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>
>
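
As a small illustration of the order() advice in the thread above: a custom
ordering can often be expressed as several sort keys rather than as a
comparison function. The data below are invented for the example; the
intent is highest score first, with ties broken alphabetically by name.

```r
# Invented example data.
score <- c(90, 85, 90, 70)
name  <- c("carol", "bob", "alice", "dave")

# First key: negated score, so larger scores come first.
# Second key: name, used only to break ties in the first key.
idx <- order(-score, name)

print(name[idx])
print(score[idx])
```

The keys are ordinary vectors, so no class or method definitions are
needed, which is presumably part of why this route tends to be faster.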