[Rd] collation order

Thomas Lumley tlumley at u.washington.edu
Fri Mar 17 22:32:32 CET 2006


The following caused a hard-to-diagnose problem for a user of the survey 
package.  Presumably this is a strange Unicode thing, but is there a 
convenient reference for how the collation order is determined? I am 
surprised that adding the same character to the end of two strings of the 
same length can change the sorting order.

In the en_US.utf8 locale:
> "1//"<"10/"
[1] TRUE
> "1//2"<"10/2"
[1] FALSE

In the C locale on the same system:
> "1//"<"10/"
[1] TRUE
> "1//2"<"10/2"
[1] TRUE

[This is in r-devel of March 6, but the problem that was reported to me 
involved Windows vs Linux on released versions.]
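
For code that has to give the same ordering on every platform, one possible 
workaround is to make the comparison independent of the session locale. The 
sketch below is only an illustration and assumes that byte-wise (C locale) 
ordering is acceptable for the strings involved. The variable names are made 
up for the example. It uses Sys.setlocale() to switch LC_COLLATE temporarily, 
and sort(method = "radix"), which is documented to use C-locale ordering for 
character vectors.

```r
# Two strings from the transcript above, deliberately given in reverse C order.
x <- c("10/2", "1//2")

# Remember the current collation setting so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# Switch to byte-wise (C locale) collation for the comparison.
invisible(Sys.setlocale("LC_COLLATE", "C"))
c_less   <- "1//2" < "10/2"   # "/" (0x2F) precedes "0" (0x30) byte-wise
c_sorted <- sort(x)

# Restore the original collation locale.
invisible(Sys.setlocale("LC_COLLATE", old_collate))

# Alternative that leaves the session locale alone: radix sorting of
# character vectors follows C-locale order whatever LC_COLLATE is set to.
radix_sorted <- sort(x, method = "radix")

print(c_less)
print(c_sorted)
print(radix_sorted)
```

Both approaches reproduce the C-locale result shown above ("1//2" sorts before 
"10/2"). Neither explains how the en_US.utf8 order is determined.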

 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


