[Rd] bug in rank(), order(), is.unsorted() on character vector
Hervé Pagès
hpages at fhcrc.org
Thu Dec 8 19:26:48 CET 2011
Hi Barry,
Hope you don't mind if I put this back on the list.
On 11-12-08 05:50 AM, Barry Rowlingson wrote:
> 2011/12/8 Hervé Pagès<hpages at fhcrc.org>:
>
>> A naive question: wouldn't everything be simpler if LC_COLLATE=C
>> was the default for everybody?
>
> Yet when we Brits suggest everything would be simpler if the whole
> world spoke the Queen's English it causes all sorts of trouble...
:-) Sure, I see your point.
But this is a programming language, used by a lot of researchers.
And having the result of an analysis depend on a crazy collation
order causes all sorts of trouble too.
Note that fighting back against the Empire is a lost battle anyway.
When you use R (as a user or a developer), any function name you
type (sort, rank, print, summary, etc.) is in the Queen's English.
And so are their man pages.
Also note that I was just talking about the *default*. AFAIK other
very serious projects like Python or SQLite *by default* use a
collating sequence that behaves like LC_COLLATE=C on strings
that contain ASCII chars only. And they let you change that if you
want. Are they being imperialist? Most R users/developers are in
research or academia, where I suspect consistency and reproducibility
are an even bigger deal than in the Python or SQLite communities.
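
To illustrate the Python side of that claim, here is a minimal sketch.
It assumes only the standard library; the helper name locale_sorted is
made up for this example and is not part of Python or R. The default
sort compares strings by code point, which for ASCII-only strings gives
the same order as LC_COLLATE=C, and locale-aware ordering is something
you have to ask for explicitly.

```python
import locale

words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Default behaviour: strings are compared by Unicode code point, so for
# ASCII-only input all uppercase letters sort before all lowercase ones,
# regardless of the user's locale settings.
default_order = sorted(words)
print(default_order)  # ['Apple', 'Banana', 'apple', 'banana', 'cherry']


def locale_sorted(items, locale_name=""):
    """Hypothetical helper: sort items using the given LC_COLLATE locale.

    An empty locale_name means "use the user's environment". The previous
    collation setting is restored afterwards. locale.setlocale raises
    locale.Error if the requested locale is not installed.
    """
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(items, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


# Asking explicitly for the C locale reproduces the default order.
print(locale_sorted(words, "C") == default_order)
```

Calling locale_sorted(words) with no locale name picks up whatever the
environment says, so its result can differ from machine to machine.
That is exactly the kind of dependence on the user's setup I am
complaining about.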
Cheers,
H.
>
> Barry
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319