[Rd] multiple issues with is.unsorted()
Hervé Pagès
hpages at fhcrc.org
Wed Apr 24 08:09:21 CEST 2013
Hi,
In the man page for is.unsorted():
Value:
A length-one logical value. All objects of length 0 or 1 are
sorted: the result will be ‘NA’ for objects of length 2 or more
except for atomic vectors and objects with a class (where the ‘>=’
or ‘>’ method is used to compare ‘x[i]’ with ‘x[i-1]’ for ‘i’ in
‘2:length(x)’).
This contains many incorrect statements:
> length(NA)
[1] 1
> is.unsorted(NA)
[1] NA
> length(list(NA))
[1] 1
> is.unsorted(list(NA))
[1] NA
=> Contradicts "all objects of length 0 or 1 are sorted".
> is.unsorted(raw(2))
Error in is.unsorted(raw(2)) : unimplemented type 'raw' in
'isUnsorted'
=> Doesn't agree with the doc (unless "except for atomic vectors"
means "it might fail for atomic vectors").
> setClass("A", representation(aa="integer"))
> a <- new("A", aa=4:1)
> length(a)
[1] 1
> is.unsorted(a)
[1] FALSE
Warning message:
In is.na(x) : is.na() applied to non-(list or vector) of type 'S4'
=> Ok, but it's arguable whether the warning is useful/justified from a
user point of view. The warning *seems* to suggest that defining an
"is.na" method for my objects is required for is.unsorted() to
work properly, but the doc doesn't make this clear.
Anyway, let's define one, so the warning goes away:
> setMethod("is.na", "A", function(x) is.na(x@aa))
[1] "is.na"
Let's define a "length" method:
> setMethod("length", "A", function(x) length(x@aa))
[1] "length"
> length(a)
[1] 4
> is.unsorted(a)
[1] FALSE
=> Is this correct? Hard to know. The doc is not clear about what
should happen for objects of length 2 or more and with a class
but with no ">=" or ">" methods.
Let's define "[", ">=", and ">":
> setMethod("[", "A", function(x, i, j, ..., drop=TRUE) new("A", aa=x@aa[i]))
[1] "["
> rev(a)
An object of class "A"
Slot "aa":
[1] 1 2 3 4
> setMethod(">=", c("A", "A"), function(e1, e2) {e1@aa >= e2@aa})
[1] ">="
> a >= a[3]
[1] TRUE TRUE TRUE FALSE
> setMethod(">", c("A", "A"), function(e1, e2) {e1@aa > e2@aa})
[1] ">"
> a > a[3]
[1] TRUE TRUE FALSE FALSE
> is.unsorted(a)
[1] FALSE
> is.unsorted(rev(a))
[1] FALSE
Still not working as expected. So what's required exactly for making
is.unsorted() work on an object "with a class"?
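For comparison, here is a minimal sketch of what a literal reading of the man page would give for an object "with a class": compare 'x[i]' with 'x[i-1]' for 'i' in '2:length(x)' using the object's own ">" or ">=" method. The helper name is_unsorted_by_methods() is hypothetical and this is not how is.unsorted() is implemented; it only assumes that "length", "[", ">" and ">=" methods exist for the class, as defined above.

```r
library(methods)

# Same toy class and methods as in the transcript above.
setClass("A", representation(aa = "integer"))
setMethod("length", "A", function(x) length(x@aa))
setMethod("[", "A", function(x, i, j, ..., drop = TRUE) new("A", aa = x@aa[i]))
setMethod(">", c("A", "A"), function(e1, e2) e1@aa > e2@aa)
setMethod(">=", c("A", "A"), function(e1, e2) e1@aa >= e2@aa)

# Hypothetical helper (not part of base R): walks adjacent pairs and
# relies only on the length, "[", ">" and ">=" methods of the object.
is_unsorted_by_methods <- function(x, strictly = FALSE) {
  n <- length(x)
  if (n < 2L) return(FALSE)          # length 0 or 1: treated as sorted
  for (i in 2:n) {
    prev <- x[i - 1L]
    curr <- x[i]
    # Non-strict: a descent breaks the order; strict: a tie does too.
    out_of_order <- if (strictly) prev >= curr else prev > curr
    if (isTRUE(out_of_order)) return(TRUE)
  }
  FALSE
}

a <- new("A", aa = 4:1)
sorted_a <- new("A", aa = 1:4)
is_unsorted_by_methods(a)         # TRUE
is_unsorted_by_methods(sorted_a)  # FALSE
```

With this reading, the decreasing object is reported as unsorted and the increasing one as sorted, which is what I expected from is.unsorted() above.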
BTW, is.unsorted() would be *much* faster, at least on atomic vectors,
without those calls to is.na(). The C code could check for NAs itself,
without having to do this as a first pass on the full vector, as is the
case with the current implementation. If the vector is unsorted, the
C code is typically able to bail out early, so the speed-up will
typically be 10000x or more if the vector has millions of elements.
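To make the idea concrete, here is a rough sketch of the single-pass logic, written in R only for readability (the actual gain would come from doing this in the C code, and an R-level loop like this one is slow). The function name early_exit_is_unsorted() is hypothetical. One assumption to flag: it returns at the first NA or the first out-of-order pair, whichever comes first, so it can return TRUE for a vector with an NA located after the first out-of-order pair, where a full is.na() pre-pass would give NA.

```r
# Hypothetical helper (not part of base R): one pass over an atomic
# vector, checking for NA and for order at the same time.
early_exit_is_unsorted <- function(x, strictly = FALSE) {
  stopifnot(is.atomic(x))
  n <- length(x)
  if (n == 0L) return(FALSE)
  if (is.na(x[1L])) return(NA)
  for (i in seq_len(n)[-1L]) {
    # NA check happens on the fly, not as a separate full pass.
    if (is.na(x[i])) return(NA)
    out_of_order <- if (strictly) x[i - 1L] >= x[i] else x[i - 1L] > x[i]
    # Bail out as soon as one out-of-order pair is seen.
    if (out_of_order) return(TRUE)
  }
  FALSE
}

early_exit_is_unsorted(c(1, 3, 2))    # TRUE, stops at the third element
early_exit_is_unsorted(1:10)          # FALSE, needs the full pass
early_exit_is_unsorted(c(1, NA, 0))   # NA, stops at the second element
```

Only the sorted case has to look at every element; an unsorted vector is typically decided after a few comparisons.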
Thanks,
H.
> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.0.0
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319