[Rd] bug in rank(), order(), is.unsorted() on character vector

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Wed Dec 7 19:34:16 CET 2011


2011/12/7 Joris Meys <jorismeys at gmail.com>:
> @Barry : regardless of whether '_' comes before or after '1' , it
> should be consistent. Adding an 'a' shouldn't shift '_' from before
> '1' to between '1' and '2', that's clearly an error. The help files
> are not stating anything about that.

 That's an assumption. The help pages are quite clear about making assumptions.

 The only way this could be a 'bug' is if you can show that the sort
order in R is different from the lexicographic sort order using the
collating sequence of the locale in use. But even my command line
'sort' agrees:

$ sort < f1.txt
_1_
1_9
2_9

 Now add the trailing 'a' to each line:

$ sort < f1.txt
1_9a
_1_a
2_9a
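
As a rough check on the locale explanation, the same comparison can be
repeated with collation forced to the C locale, where strings are compared
byte by byte. This is a minimal sketch, assuming a POSIX shell with sort and
mktemp available; the scratch file names plain.txt and with_a.txt are made up
for illustration and are not the f1.txt used above:

```shell
# Hypothetical scratch files; these names are not from the original post.
tmp=$(mktemp -d)
printf '%s\n' _1_ 1_9 2_9 > "$tmp/plain.txt"
printf '%s\n' _1_a 1_9a 2_9a > "$tmp/with_a.txt"

# Byte-order collation: '_' (0x5F) comes after every digit (0x30-0x39).
c_plain=$(LC_ALL=C sort "$tmp/plain.txt")
c_with_a=$(LC_ALL=C sort "$tmp/with_a.txt")
printf '%s\n' "$c_plain" "$c_with_a"

# Collation of the current session's locale; the result depends on the system.
sort "$tmp/plain.txt"
sort "$tmp/with_a.txt"
```

In the C locale '_' sorts after the digits in both files, so if the position
of '_' moves once the 'a' is appended, that comes from the collating rules of
the locale in use and not from the sort routine itself.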

[ I had a thought maybe it was because _ is sometimes used to break
thousands in numeric formats, but I can't get any obvious consistency
out of that hypothesis ]

Barry


