[Rd] Bug in rank with utf8?

John McKown john.archie.mckown at gmail.com
Thu Aug 13 16:01:16 CEST 2015


2015-08-13 8:39 GMT-05:00 Hadley Wickham <h.wickham at gmail.com>:

> x <- "\u0663"
> y <- 3
>
> x == y
> # FALSE
> rank(c(x, y))
> # c(1.5, 1.5)
>

Also interesting, and confusing to me:

> x == y
[1] FALSE
> x > y
[1] FALSE
> x < y
[1] FALSE
>

With some slight changes:

> x <- "\u0663"
> y <- "3"
> xy <- c(x,y)
> rank(xy);
[1] 1.5 1.5
> Sys.getlocale();
[1] "LC_CTYPE=en_US.UTF8;LC_NUMERIC=C;LC_TIME=en_US.UTF8;LC_COLLATE=en_US.UTF8;LC_MONETARY=en_US.UTF8;LC_MESSAGES=en_US.UTF8;LC_PAPER=en_US.UTF8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF8;LC_IDENTIFICATION=C"
> Sys.setlocale(category="LC_COLLATE", locale="C");
[1] "C"
> rank(xy);
[1] 2 1
>
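
The transcript above suggests the tie comes from the collation locale rather than from rank() itself. A minimal sketch of how one might check that, assuming the same x and y as above; the names cp, ord, old and r_c are only illustrative and not from the original report:

```r
# Same inputs as in the transcript above.
x <- "\u0663"   # ARABIC-INDIC DIGIT THREE
y <- "3"        # ASCII DIGIT THREE
xy <- c(x, y)

# Unicode code points; these do not depend on any locale setting.
cp <- c(utf8ToInt(x), utf8ToInt(y))
print(cp)   # 1635 51

# order() with method = "radix" is documented to ignore the collation
# locale for character vectors, so it gives a locale-independent ordering.
ord <- order(xy, method = "radix")
print(ord)

# Rank under the C collation, then restore the previous LC_COLLATE value.
old <- Sys.getlocale("LC_COLLATE")
Sys.setlocale("LC_COLLATE", "C")
r_c <- rank(xy)
Sys.setlocale("LC_COLLATE", old)
print(r_c)
```

If the two strings only tie under en_US.UTF8 and not here, that would point at the locale's collation rules treating the two digits as equivalent, not at the ranking code.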

-- 

Schrodinger's backup: The condition of any backup is unknown until a
restore is attempted.

Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown
