[R] Puzzled by results from base::rank()

Ebert, Timothy Aaron
Fri Aug 11 14:32:06 CEST 2023


I entered the values into Excel and sorted them. I assume you are asking why the value 3 has a rank of 4.5 in x2 but a rank of 5 in x3.
Sorted, x2 looks like this:
Value   Rank    Order
1       1.5     1
1       1.5     2
2       3       3
3       4.5     4
3       4.5     5
4       6       6
5       8       7
5       8       8
5       8       9
6       10      10
9       11      11

The average of 4 and 5 is 4.5.
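
The same table can be rebuilt in R rather than Excel. This is only a sketch using base R and the x2 vector from Chris's post; the data frame and its column names are my own choice, picked to match the table above.

```r
x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

sorted <- sort(x2)
tab <- data.frame(
  Value = sorted,
  Rank  = rank(sorted),        # default ties.method = "average"
  Order = seq_along(sorted)    # position in the sorted vector
)
print(tab)
```

The Rank column for each tied group is the mean of the Order values that the group occupies.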

For x3 we have:

Value   Rank    Order
1       1.5     1
1       1.5     2
2       3       3
3       5       4
3       5       5
3       5       6
4       7       7
5       9       8
5       9       9
5       9       10
6       11      11
9       12      12

The ranks of the threes are 4, 5, and 6 and the average is 5.
For any set of values, adding one more value equal to an existing value always increases the rank assigned to that value (with the default ties.method = "average"). The rank has not been rounded up, though it may look that way in this example. If you add another 3 to x3, the rank becomes 5.5, and adding yet another 3 gives a rank of 6. Each additional 3 raises the rank by 0.5, because the tied group takes in one more position at its upper end.
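
A quick way to check the 0.5 step is to append more and more 3s to x2 and record the rank that the 3s share. This is a sketch in base R; the names `extra` and `rank_of_three` are just for illustration.

```r
x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

# Append 0, 1, 2 and 3 extra threes and record the rank shared by all 3s.
rank_of_three <- sapply(0:3, function(extra) {
  x <- c(x2, rep(3, extra))
  unique(rank(x)[x == 3])
})
print(rank_of_three)   # 4.5 5.0 5.5 6.0
```

Only every second step lands on a whole number, which is why the triplet looks "rounded" while the doublet does not.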

You can get a different result if you change a value rather than add one. Suppose there is a mistake in the data and the second 1 in x2 should have been a 3: the 3s then occupy positions 3, 4, and 5, their rank is 4, and it looks as if I had rounded down. If instead a value greater than 3 should have been a 3, the 3s occupy positions 4, 5, and 6, their rank is 5, and it again looks as if I had rounded up. The appearance of "rounding" is an illusion, easily seen through once you extend the example to more cases.
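
The two "corrections" can be tried directly. In this sketch I assume the second 1 is x2[4] and use the 4 at x2[3] as the value greater than 3; the variable names are my own.

```r
x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

low <- x2
low[4] <- 3     # the second 1 becomes a 3
high <- x2
high[3] <- 3    # the 4 becomes a 3

rank_low  <- unique(rank(low)[low == 3])     # 3s sit at sorted positions 3, 4, 5
rank_high <- unique(rank(high)[high == 3])   # 3s sit at sorted positions 4, 5, 6
print(c(low = rank_low, high = rank_high))   # 4 and 5
```

Both results are whole numbers simply because the mean of three consecutive integers is the middle one.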



Tim

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Gerrit Eichner
Sent: Friday, August 11, 2023 4:32 AM
To: r-help using r-project.org
Subject: Re: [R] Puzzled by results from base::rank()

Dear Chris,

the members of the triplet would be ranked 4, 5 and 6 (in your example), so the *mean of their ranks* is correctly 5.

For any set of k tied values the ranks of its elements are averaged (and assigned to each of its k members).
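
The averaging rule can be checked by hand in R. This is a sketch, not how rank() is implemented internally: it gives the ties distinct consecutive ranks with ties.method = "first" and then averages those ranks within each group of equal values using ave().

```r
x3 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 3)

# Ties get distinct consecutive ranks, in order of appearance.
r_first <- as.numeric(rank(x3, ties.method = "first"))

# Replace each rank by the mean rank of its group of tied values.
r_avg <- ave(r_first, x3)

print(all.equal(r_avg, rank(x3)))   # TRUE
```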

  Hth  --  Gerrit

---------------------------------------------------------------------
Dr. Gerrit Eichner                   Mathematical Institute, Room 215
gerrit.eichner using math.uni-giessen.de   Justus-Liebig-University Giessen
Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
http://www.uni-giessen.de/eichner
---------------------------------------------------------------------

Am 11.08.2023 um 09:54 schrieb Chris Evans via R-help:
> I understand that the default ties.method is "average".  Here is what
> I get, expanding a bit on the help page example. Running R 4.3.1 on
> Ubuntu 22.04.2.
>
>  > x2 <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)
>  > rank(x2)
>   [1]  4.5  1.5  6.0  1.5  8.0 11.0  3.0 10.0  8.0  4.5  8.0
>
> OK so the ties, each with two members, are ranked to their mean.
>
> So now I turn one tie from a twin to a triplet:
>
>  > x3 <- c(x2, 3)
>  > rank(x3)
>   [1]  5.0  1.5  7.0  1.5  9.0 12.0  3.0 11.0  9.0  5.0  9.0  5.0
>  > sprintf("%4.3f", rank(x3))
>   [1] "5.000"  "1.500"  "7.000"  "1.500"  "9.000"  "12.000" "3.000"  "11.000"
>   [9] "9.000"  "5.000"  "9.000"  "5.000"
>
> The doublet is still given the mean of the values but the triplet is
> rounded up.  What am I missing here?!
>
> TIA,
>
> Chris
>

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


