[Rd] multiple issues with is.unsorted()
Hervé Pagès
hpages at fhcrc.org
Wed Apr 24 21:06:55 CEST 2013
On 04/24/2013 12:00 PM, Hervé Pagès wrote:
> Hi,
>
> On 04/24/2013 09:27 AM, William Dunlap wrote:
>>> >>> is.unsorted(NA)
>>> >> [1] NA
>>> >> => Contradicts "all objects of length 0 or 1 are sorted".
>>>
>>> Ok. I really think we should change the above.
>>> If NA is for a missing number, it still cannot be unsorted if it
>>> is of length one.
>>>
>>> --> the above will give FALSE "real soon now".
>>
>> It depends what you are using the result of is.unsorted() for. If you
>> want to know if you can save time by not calling x<-sort(x) then
>> is.unsorted(NA) should not say that NA is sorted, as sort(NA) has
>> length 0.
>
> Glad you mention this. This is related but actually a different issue
> which is that by default is.unsorted() and sort() don't treat NAs
> consistently: the former keeps them, the latter removes them. So if
> you want to use is.unsorted() for deciding whether or not you're going
> to call sort() (without specifying 'na.last'), you should do
> 'is.unsorted( , na.rm=TRUE)'.
>
> This is why IMO 'is.unsorted( , na.rm=TRUE)' is an important use case
> and should be as fast as possible.
>
> If you want to keep NAs, you'll have to sort 'x' with either
> na.last=TRUE or na.last=FALSE. So it makes a lot of sense that
> is.unsorted(x) returns FALSE if x is a single NA, because, in that
> case, 'x' doesn't need to be sorted.
And I should add that, for that use case (want to keep NAs when
sorting), is.unsorted() is totally useless anyway because it will
return NA if 'x' has length >= 2 and contains NAs :-/
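
To make the two use cases concrete, here is a minimal sketch. The
helper names sort_if_needed() and is_unsorted_na_last() are made up
for illustration and are not part of base R. It assumes 'x' is a plain
atomic vector of a type that is.unsorted() supports:

```r
# Hypothetical helper (not in base R): skip the sort when it isn't needed,
# treating NAs the way sort() does by default, i.e. dropping them.
sort_if_needed <- function(x) {
  if (is.unsorted(x, na.rm = TRUE)) sort(x) else x[!is.na(x)]
}

# Hypothetical helper (not in base R): TRUE if 'x' is not already in the
# order that sort(x, na.last = TRUE) would give, i.e. an NA precedes a
# non-NA value or the non-NA values are out of order.
is_unsorted_na_last <- function(x) {
  nas <- is.na(x)
  is.unsorted(nas) || is.unsorted(x[!nas])
}

x <- c(1, NA, 3)
is.unsorted(x)                    # NA: not usable as an if() condition
sort_if_needed(x)                 # 1 3
is_unsorted_na_last(x)            # TRUE: the NA is not at the end
is_unsorted_na_last(c(1, 3, NA))  # FALSE
```

The second helper sidesteps the NA result by checking the positions of
the NAs and the order of the non-NA values separately. Whether
is.unsorted() itself should grow something like this is a separate
question.
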
Cheers,
H.
>
> Cheers,
> H.
>
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>>
>>> -----Original Message-----
>>> From: r-devel-bounces at r-project.org
>>> [mailto:r-devel-bounces at r-project.org] On Behalf
>>> Of Martin Maechler
>>> Sent: Wednesday, April 24, 2013 8:41 AM
>>> To: Hervé Pagès; R-devel at stat.math.ethz.ch
>>> Cc: Martin Maechler
>>> Subject: Re: [Rd] multiple issues with is.unsorted()
>>>
>>> More comments .. see inline
>>>
>>>>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>>>> on Wed, 24 Apr 2013 11:29:39 +0200 writes:
>>>
>>> > Dear Herve,
>>>>>>>> Hervé Pagès <hpages at fhcrc.org>
>>>>>>>> on Tue, 23 Apr 2013 23:09:21 -0700 writes:
>>>
>>> >> Hi, In the man page for is.unsorted():
>>>
>>> >> Value:
>>>
>>> >> A length-one logical value. All objects of length 0 or 1
>>> >> are sorted: the result will be ‘NA’ for objects of length
>>> >> 2 or more except for atomic vectors and objects with a
>>> >> class (where the ‘>=’ or ‘>’ method is used to compare
>>> >> ‘x[i]’ with ‘x[i-1]’ for ‘i’ in ‘2:length(x)’).
>>>
>>> >> This contains many incorrect statements:
>>>
>>> >>> length(NA)
>>> >> [1] 1
>>> >>> is.unsorted(NA)
>>> >> [1] NA
>>> >>> length(list(NA))
>>> >> [1] 1
>>> >>> is.unsorted(list(NA))
>>> >> [1] NA
>>>
>>> >> => Contradicts "all objects of length 0 or 1 are sorted".
>>>
>>> Ok. I really think we should change the above.
>>> If NA is for a missing number, it still cannot be unsorted if it
>>> is of length one.
>>>
>>> --> the above will give FALSE "real soon now".
>>>
>>> >>> is.unsorted(raw(2))
>>> >> Error in is.unsorted(raw(2)) : unimplemented type 'raw'
>>> >> in 'isUnsorted'
>>>
>>> >> => Doesn't agree with the doc (unless "except for atomic
>>> >> vectors" means "it might fail for atomic vectors").
>>>
>>> Well, the doc says about 'x'
>>> | \item{x}{an \R object with a class or a numeric, complex,
>>> | character or logical vector.}
>>> so strictly, is.unsorted() is not to be used on raw vectors.
>>>
>>> However I think you have a point:
>>> Raw vectors didn't exist when is.unsorted() was
>>> invented, so were not considered back then.
>>> Originally, raw vectors were really almost only there for
>>> storage, i.e. basically read and write, but now that we have
>>> '<', '<=', '==' etc. working well for raw(),
>>> we could allow is.unsorted() to work, too.
>>>
>>> Note however, that if you try to sort(<raw>) you also always get
>>> an error about sort() not being implemented for raw(),...
>>> something we could arguably reconsider, as we admitted the
>>> relational operators (< <= == >= > != ) to work.
>>> {{anyone donating patches to R-devel for sort()ing raw ?}}
>>>
>>>
>>> >>> setClass("A", representation(aa="integer"))
>>> >>> a <- new("A", aa=4:1)
>>> >>> length(a)
>>> >> [1] 1
>>>
>>> >>> is.unsorted(a)
>>> >> [1] FALSE
>>> >> Warning message: In is.na(x) : is.na() applied
>>> >> to non-(list or vector) of type 'S4'
>>>
>>> >> => Ok, but it's arguable whether the warning is useful/justified
>>> >> from a user point of view. The warning *seems* to suggest
>>> >> that defining an "is.na" method for my objects is
>>> >> required for is.unsorted() to work properly but the doc
>>> >> doesn't make this clear.
>>>
>>> you are right.
>>> We are going to improve on this, at least the documentation.
>>>
>>>
>>> [.................]
>>>
>>> The S4 part I've already started addressing in the last reply.
>>> (and we may get back to that.. )
>>>
>>> [.................]
>>>
>>> Martin
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319