[Rd] multiple issues with is.unsorted()

Hervé Pagès hpages at fhcrc.org
Wed Apr 24 21:06:55 CEST 2013



On 04/24/2013 12:00 PM, Hervé Pagès wrote:
> Hi,
>
> On 04/24/2013 09:27 AM, William Dunlap wrote:
>>>     >>> is.unsorted(NA)
>>>     >> [1] NA
>>>     >> => Contradicts "all objects of length 0 or 1 are sorted".
>>>
>>> Ok.  I really think we should change the above.
>>> If NA is for a missing number, it still cannot be unsorted if it
>>> is of length one.
>>>
>>> --> the above will give FALSE  "real soon now".
>>
>> It depends what you are using the result of is.unsorted() for.  If you
>> want to know if you can save time by not calling x<-sort(x)  then
>> is.unsorted(NA) should not say that NA is sorted, as sort(NA) has
>> length 0.
>
> Glad you mention this. This is related but actually a different issue
> which is that by default is.unsorted() and sort() don't treat NAs
> consistently: the former keeps them, the latter removes them. So if
> you want to use is.unsorted() for deciding whether or not you're going
> to call sort() (without specifying 'na.last'), you should do
> 'is.unsorted( , na.rm=TRUE)'.
>
> This is why IMO 'is.unsorted( , na.rm=TRUE)' is an important use case
> and should be as fast as possible.
>
> If you want to keep NAs, you'll have to sort 'x' with either
> na.last=TRUE or na.last=FALSE. So it makes a lot of sense that
> is.unsorted(x) returns FALSE if x is a single NA, because, in that
> case, 'x' doesn't need to be sorted.

And I should add that, for that use case (want to keep NAs when
sorting), is.unsorted() is totally useless anyway because it will
return NA if 'x' has length >= 2 and contains NAs :-/
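
A minimal sketch of the inconsistency described above, using a small
made-up vector (the name 'x' and its values are only for illustration).
It assumes nothing beyond base R's is.unsorted() and sort():

```r
# Illustrative data: length >= 2 and contains an NA.
x <- c(1, NA, 3)

# Default (na.rm = FALSE): NAs are kept, so the answer is NA
# rather than TRUE or FALSE.
is.unsorted(x)                # NA

# Dropping NAs first matches what sort() does by default.
is.unsorted(x, na.rm = TRUE)  # FALSE
sort(x)                       # 1 3

# Keeping NAs requires an explicit 'na.last', and is.unsorted(x)
# above gave no usable hint about whether this call is needed.
sort(x, na.last = TRUE)       # 1 3 NA
```

So to decide whether a plain sort(x) can be skipped, the check has to be
is.unsorted(x, na.rm = TRUE), and even then sort(x) would still drop
the NAs from 'x'.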

Cheers,
H.

>
> Cheers,
> H.
>
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>
>>> -----Original Message-----
>>> From: r-devel-bounces at r-project.org
>>> [mailto:r-devel-bounces at r-project.org] On Behalf
>>> Of Martin Maechler
>>> Sent: Wednesday, April 24, 2013 8:41 AM
>>> To: Hervé Pagès; R-devel at stat.math.ethz.ch
>>> Cc: Martin Maechler
>>> Subject: Re: [Rd] multiple issues with is.unsorted()
>>>
>>> More comments .. see inline
>>>
>>>>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>>>>      on Wed, 24 Apr 2013 11:29:39 +0200 writes:
>>>
>>>      > Dear Herve,
>>>>>>>> Hervé Pagès <hpages at fhcrc.org>
>>>>>>>> on Tue, 23 Apr 2013 23:09:21 -0700 writes:
>>>
>>>      >> Hi, In the man page for is.unsorted():
>>>
>>>      >> Value:
>>>
>>>      >> A length-one logical value.  All objects of length 0 or 1
>>>      >> are sorted: the result will be ‘NA’ for objects of length
>>>      >> 2 or more except for atomic vectors and objects with a
>>>      >> class (where the ‘>=’ or ‘>’ method is used to compare
>>>      >> ‘x[i]’ with ‘x[i-1]’ for ‘i’ in ‘2:length(x)’).
>>>
>>>      >> This contains many incorrect statements:
>>>
>>>      >>> length(NA)
>>>      >> [1] 1
>>>      >>> is.unsorted(NA)
>>>      >> [1] NA
>>>      >>> length(list(NA))
>>>      >> [1] 1
>>>      >>> is.unsorted(list(NA))
>>>      >> [1] NA
>>>
>>>      >> => Contradicts "all objects of length 0 or 1 are sorted".
>>>
>>> Ok.  I really think we should change the above.
>>> If NA is for a missing number, it still cannot be unsorted if it
>>> is of length one.
>>>
>>> --> the above will give FALSE  "real soon now".
>>>
>>>      >>> is.unsorted(raw(2))
>>>      >> Error in is.unsorted(raw(2)) : unimplemented type 'raw'
>>>      >> in 'isUnsorted'
>>>
>>>      >> => Doesn't agree with the doc (unless "except for atomic
>>>      >> vectors" means "it might fail for atomic vectors").
>>>
>>> Well, the doc says about 'x'
>>> |  \item{x}{an \R object with a class or a numeric, complex,
>>> |    character or logical vector.}
>>> so strictly, is.unsorted() is not to be used on raw vectors.
>>>
>>> However I think you have a point:
>>> Raw vectors didn't exist when is.unsorted() was
>>> invented, so they were not considered back then.
>>> Originally, raw vectors were really almost only there for
>>> storage, i.e. basically read and write, but now that we have
>>> '<', '<=', '==' etc. working well for raw(),
>>> we could allow is.unsorted() to work, too.
>>>
>>> Note however, that if you try to sort(<raw>) you also always get
>>> an error about sort() not being implemented for raw(),...
>>> something we could arguably reconsider, as we admitted the
>>> relational operators (< <= == >= >  != ) to work.
>>> {{anyone donating patches to R-devel for sort()ing raw ?}}
>>>
>>>
>>>      >>> setClass("A", representation(aa="integer"))
>>>      >>> a <- new("A", aa=4:1)
>>>      >>> length(a)
>>>      >> [1] 1
>>>
>>>      >>> is.unsorted(a)
>>>      >> [1] FALSE
>>>      >>  Warning message: In is.na(x) : is.na() applied
>>>      >> to non-(list or vector) of type 'S4'
>>>
>>>      >> => Ok, but it's debatable whether the warning is useful/justified
>>>      >> from a user point of view. The warning *seems* to suggest
>>>      >> that defining an "is.na" method for my objects is
>>>      >> required for is.unsorted() to work properly but the doc
>>>      >> doesn't make this clear.
>>>
>>> you are right.
>>> We are going to improve on this, at least the documentation.
>>>
>>>
>>> [.................]
>>>
>>> The S4 part I've already started addressing in the last reply.
>>> (and we may get back to that.. )
>>>
>>> [.................]
>>>
>>> Martin
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


