[R] Strange result from sort: sort(c("aa", "ff")) gives "ff" "aa" with R.2.12.1 on windows 7

Søren Højsgaard Soren.Hojsgaard at agrsci.dk
Mon Jan 24 22:44:35 CET 2011


Dear list,

Please consider the following call of sort

> sort(c("a","f"))
[1] "a" "f"
> sort(c("f","a"))
[1] "a" "f"
>
> sort(c("aa","ff"))
[1] "ff" "aa"
> sort(c("ff","aa"))
[1] "ff" "aa"

The last two results look strange to me. Is this a bug?

The result seems to come from calls to order:

> order(c("a","f"))
[1] 1 2
> order(c("f","a"))
[1] 2 1
>
> order(c("aa","ff"))
[1] 2 1
> order(c("ff","aa"))
[1] 1 2

I get the same results with R 2.12.1, R 2.11.1 and R 2.13.0 on Windows 7. On Linux, however, I get the "right answer" (the answer I expected). From the help pages I get the impression that the locale may be involved, but I did not understand the details.
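
To make the locale question concrete, here is a minimal sketch of a check one could run on both machines. It assumes that the collation setting (LC_COLLATE) is what differs between the two systems, which is only a guess at this point. The variable names are just for illustration.

```r
# Remember the current collation locale so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

x <- c("ff", "aa")

# Sort under whatever collation rules the session currently uses.
sorted_current <- sort(x)

# Switch to the "C" locale, which compares strings byte by byte,
# and sort the same vector again.
invisible(Sys.setlocale("LC_COLLATE", "C"))
sorted_c <- sort(x)

# Restore the original collation setting.
invisible(Sys.setlocale("LC_COLLATE", old_collate))

print(old_collate)     # the collation locale that was in effect
print(sorted_current)  # result under the session's own collation
print(sorted_c)        # result under byte-wise "C" collation
```

If the two sorted vectors differ on the Windows machine but agree on Linux, that would point to the collation rules of the locale rather than to sort or order themselves.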

Can anyone tell me what is going on here, please?

Regards
Søren


> sessionInfo()
R version 2.12.1 Patched (2010-12-27 r53883)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=Danish_Denmark.1252  LC_CTYPE=Danish_Denmark.1252
[3] LC_MONETARY=Danish_Denmark.1252 LC_NUMERIC=C
[5] LC_TIME=Danish_Denmark.1252
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
[1] SHDtools_1.0


> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i686-pc-linux-gnu (32-bit)
locale:
 [1] LC_CTYPE=en_DK.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_DK.utf8        LC_COLLATE=en_DK.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_DK.utf8
 [7] LC_PAPER=en_DK.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_DK.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


