[Rd] sort - Windows and Linux
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue May 27 18:55:29 CEST 2008
This sort of thing does depend on the locale, but here I get the same
answer on Windows and Linux (the order you give).
So please give us much more complete information about what locales you
used and what results you got.
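
As a rough sketch of the kind of report that would help here, the following prints the collation locale and platform next to the sorted result. It uses only base R; the variable names `x` and `collate` are illustrative and not from the original post.

```r
# Report the collation locale and platform alongside the result of sort(),
# so that results from two machines can be compared like for like.
x <- c(" 1", " 2", "10")

collate <- Sys.getlocale("LC_COLLATE")   # the locale category that governs sort()
cat("LC_COLLATE:", collate, "\n")
cat("Platform:  ", R.version$platform, "\n")

print(sort(x))
```

Running this on each platform and posting both outputs would show whether the difference comes from the locale setting or from the OS collation routines.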
?sort does say

     The sort order for character vectors will depend on the collating
     sequence of the locale in use: see 'Comparison'.

and ?Comparison says

     Collation of
     non-letters (spaces, punctuation signs, hyphens, fractions and so
     on) is even more problematic.

so you have been warned.
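
If a platform-independent order matters more than a locale-aware one, one possible workaround (an illustration, not a recommendation made in this thread) is to switch LC_COLLATE to the "C" locale for the duration of the sort. There strings are normally compared byte by byte, so a space (0x20) sorts before any digit. The names `x`, `old` and `sorted_c` are illustrative, and this assumes the R build is not forcing ICU collation (for example via the R_ICU_LOCALE environment variable).

```r
# Sort under the "C" collation locale, then restore the previous setting.
x <- c("10", " 2", " 1")

old <- Sys.getlocale("LC_COLLATE")            # remember the current collation locale
invisible(Sys.setlocale("LC_COLLATE", "C"))   # byte-wise comparison of strings

sorted_c <- sort(x)

invisible(Sys.setlocale("LC_COLLATE", old))   # put the original locale back
print(sorted_c)
```

The cost is that accented and mixed-case strings are then ordered by code point rather than by the conventions of any language.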
On Tue, 27 May 2008, Yohan Chalabi wrote:
> Dear all,
>
> While debugging a function I realized that
>
> sort(c(" 1", " 2", "10"))
>
> does not give the same result on Windows and Linux.
>
> This is actually not surprising because white spaces are not handled in
> the same manner on these two platforms. But I was wondering if this
> behavior is also desired in R.
Well, do you want R to behave in the same way as other tools on your
platform, or the same way on all implementations? One cannot have both.
Currently R is using the OS's facilities, but the NEWS for R-devel says

    o   There is support for using ICU (International Components for
        Unicode) for collation, enabled by configure option --with-ICU
        on a Unix-alike and by a setting in MkRules on Windows.
        Function icuSetCollate() allows the collation rules (including
        the locale) to be tuned.  [Experimental]

so we have been exploring alternatives (not least because the C runtime of
Mac OS X cannot sort well in UTF-8 locales).
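
A minimal sketch of trying this out, assuming a build of R with ICU support. The check `capabilities("ICU")` and the locale name "en_US" are choices made for this illustration, not part of the NEWS entry, and the interface is marked experimental, so details may differ between versions. ICU may also be bypassed when the collation locale is "C" or "POSIX", in which case the call has no visible effect.

```r
# Ask ICU, if available, to collate using en_US rules and sort again.
x <- c(" 1", " 2", "10")

has_icu <- isTRUE(unname(capabilities("ICU")))

if (has_icu) {
  icuSetCollate(locale = "en_US")   # tune ICU collation; lasts for the session
  print(sort(x))
} else {
  message("This build of R has no ICU support; sort() uses the OS collation.")
}
```

With ICU in use, the same collation rules should apply regardless of the operating system, which is the "same on all implementations" choice rather than the "same as other tools on the platform" one.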
> regards,
> Yohan Chalabi
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595