[Rd] (no subject)
Suharto Anggono Suharto Anggono
suharto_anggono at yahoo.com
Thu Oct 22 07:44:57 CEST 2015
------------------
>>>>> Henric Winell <[hidden email]>
>>>>> on Wed, 21 Oct 2015 13:43:02 +0200 writes:
> Den 2015-10-21 kl. 07:24, skrev Suharto Anggono Suharto Anggono via R-devel:
>> Marius Hofert-4------------------------------
>>> Den 2015-10-09 kl. 12:14, skrev Martin Maechler:
>>> I think so: the code above doesn't seem to do the right thing. Consider
>>> the following example:
>>>
>>> > x <- c(1, 1, 2, 3)
>>> > rank2(x, ties.method = "last")
>>> [1] 1 2 4 3
>>>
>>> That doesn't look right to me -- I had expected
>>>
>>> > rev(sort.list(x, decreasing = TRUE))
>>> [1] 2 1 3 4
>>>
>>
>> Indeed, well spotted, that seems to be correct.
>>
>>>
>>> Henric Winell
>>>
>> ------------------------------
>>
>> In the particular example (of length 4), what is really wanted is the following.
>> ind <- integer(4)
>> ind[sort.list(x, decreasing=TRUE)] <- 4:1
>> ind
> You don't provide the output here, but 'ind' is, of course,
>> ind
> [1] 2 1 3 4
>> The following gives the desired result:
>> sort.list(rev(sort.list(x, decreasing=TRUE)))
> And, again, no output, but
>> sort.list(rev(sort.list(x, decreasing=TRUE)))
> [1] 2 1 3 4
> Why is it necessary to use 'sort.list' on the result from
> 'rev(sort.list(...'?
You can try all kinds of code on this *too* simple example and do
experiments. But let's approach this a bit more scientifically
and hence systematically:
Look at rank {the R function definition} to see that
for the case of no NA's,
    rank(x, ties.method = "first") === sort.list(sort.list(x))
If you assume that to be correct and want to define "last" to be
correct as well (in the sense of being "first"-consistent),
it is clear that
    rank(x, ties.method = "last") === rev(sort.list(sort.list(rev(x))))
must also be correct. I don't think that *any* of the proposals
so far had a correct version [but the too simplistic examples
did not show the problems].
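As a minimal sketch, the two identities above can be tried on a slightly larger vector with ties. The vector x and the expected ranks (worked out by hand) are illustrative assumptions, not taken from the thread.

```r
# Illustrative vector with ties (not from the thread); no NAs
x <- c(3, 1, 1, 2)

# "first": tied elements get increasing ranks in order of position
r_first <- sort.list(sort.list(x))

# "last": tied elements get decreasing ranks in order of position
r_last <- rev(sort.list(sort.list(rev(x))))

# Expected ranks, worked out by hand
stopifnot(identical(r_first, c(4L, 1L, 2L, 3L)),
          identical(r_last,  c(4L, 2L, 1L, 3L)))
```

The two results differ only in how the tied 1s at positions 2 and 3 are ranked.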
In today's R-devel (the R development version), i.e., svn
revision >= 69549, the implementation of ties.method = "last"
uses
    ## == rev(sort.list(sort.list(rev(x)))) :
    if(length(x) == 0) integer(0)
    else { i <- length(x):1L
           sort.list(sort.list(x[i]))[i] },
which is equivalent to using rev() but a bit more efficient.
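The equivalence with the rev() form can be checked directly. This is only a sketch: the helper names rank_last_rev and rank_last_idx are hypothetical, and the code is wrapped in standalone functions here rather than sitting inside rank().

```r
# Hypothetical helper names; both assume x has no NAs
rank_last_rev <- function(x) rev(sort.list(sort.list(rev(x))))

rank_last_idx <- function(x) {
  # length(x):1L would be c(0, 1) for empty x, hence the guard
  if (length(x) == 0) return(integer(0))
  i <- length(x):1L                 # reversing index
  sort.list(sort.list(x[i]))[i]     # x[i] is rev(x); [i] reverses the result
}

x <- c(2, 1, 2, 1, 3)               # illustrative data
stopifnot(identical(rank_last_idx(x), rank_last_rev(x)),
          identical(rank_last_idx(numeric(0)), integer(0)))
```

Indexing with i twice avoids the two rev() calls, which is presumably where the small efficiency gain comes from.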
Martin Maechler, ETH Zurich
------------------
I maintain that my code is correct in general.
It all comes from the fact that, if p is a permutation of 1:n,
    { ind <- integer(n); ind[p] <- 1:n; ind }
gives the same result as
    sort.list(p)
You can make sense of it like this. After ind[p] <- 1:n, ind[1] is the position where p == 1. So, ind[1] is the position of the smallest element of p, which is the first element of sort.list(p). The remaining elements follow in the same way.
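A small check of this fact, using an arbitrary permutation chosen only for illustration:

```r
p <- c(3L, 1L, 4L, 2L)      # illustrative permutation of 1:4
n <- length(p)

ind <- integer(n)
ind[p] <- 1:n                # ind[p[k]] is k, so ind is the inverse of p

stopifnot(identical(ind, sort.list(p)),
          identical(ind, c(2L, 4L, 1L, 3L)))
```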
That's why 'sort.list' is used for ties.method="first" and ties.method="random" in function 'rank' in R. When p gives the desired order,
    { ind <- integer(n); ind[p] <- 1:n; ind }
gives ranks of the original elements based on the order. The original element in position p[1] has rank 1, the original element in position p[2] has rank 2, and so on.
Now, I say that rev(sort.list(x, decreasing=TRUE)) gives the desired order for ties.method="last". In that order, the elements run from smallest to largest, and equal elements appear in reverse order of their positions. Applying sort.list to that order then gives the ranks, which is why the outer sort.list is needed.
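To illustrate the difference between the order and the ranks, here is a sketch on a vector where the two differ. In the length-4 example from the thread they happen to coincide, because that order, c(2, 1, 3, 4), is its own inverse permutation. The vector x below is an illustrative assumption, and the sketch relies on sort.list leaving ties in their original order when decreasing = TRUE, as the argument above does.

```r
x <- c(2, 2, 1, 3, 1)                        # illustrative data with ties

# Order for "last": smallest to largest, ties in reverse position order
o <- rev(sort.list(x, decreasing = TRUE))

# Ranks are the inverse permutation of that order
r <- sort.list(o)

stopifnot(identical(o, c(5L, 3L, 2L, 1L, 4L)),
          identical(r, c(4L, 3L, 2L, 5L, 1L)),
          # agrees with the rev()-based definition on this vector
          identical(r, rev(sort.list(sort.list(rev(x))))))
```

Here o is not a valid answer for the ranks: it says which positions come first, while r says which rank each position gets.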