[Rd] invalid regular expression '[a-Z]'
Henrik Bengtsson
hb at stat.berkeley.edu
Thu Mar 6 09:52:58 CET 2008
On Wed, Mar 5, 2008 at 11:09 PM, Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
> On Wed, 5 Mar 2008, Henrik Bengtsson wrote:
>
> > On Wed, Mar 5, 2008 at 6:18 PM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> >> On 05/03/2008 8:56 PM, Henrik Bengtsson wrote:
> >> > Hi,
> >> >
> >> > just curious, but does anyone know why the following error is
> >> > observed on OSX but not on WinXP and Linux?
> >>
> >> Presumably in the locale you're using on OSX, "a" < "Z" is false. This
> >> is the ASCII sort order used in the C locale. On my Windows box, "a" <
> >> "Z" is true, because it uses the English_Canada.1252 collation order.
> >
> > That's it indeed. The person who first reported the error had
> > sessionInfo() locale
> > 'en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8' and I
> > missed that 'C' in the middle, which I guess his system falls back to
> > if none of the previous ones exist?!?
>
> No. Those are settings for various categories, just as you showed for
> Windows. The first setting appears to be LC_COLLATE, but what they mean is
> not documented on the system man page for setlocale.
>
> It's just that MacOS uses C collation order in English locales, even
> though almost everyone else uses aAbB or AaBb (the latter being what the
> English actually use, as do almost all book indices in dialects of
> English). But then there is no surprise that MacOS has to be different
> ... its implementation of locales is idiosyncratic (to be generous).
>
> Note that even [A-Za-z] is unsafe -- as I recall Z is in the middle of the
> alphabet in Estonian locales. If you want alphabetic characters, use
> [[:alpha:]]. If you want ASCII alphabetic characters, write out the
> ranges as [AB...Zab...z]
>
> E.g. (F8 Linux)
>
> > Sys.setlocale("LC_COLLATE", "et_EE.utf8")
> [1] "et_EE.utf8"
> > paste(sort(c(letters,LETTERS)), collapse="")
> [1] "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsZzTtUuVvWwXxYy"
Alpha and Omega - you said it all.
Thanks for the clarifications.
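
For the archive, here is a minimal sketch of the two points above: writing
out the ASCII letters instead of relying on a range, and checking how "a"
and "Z" compare under C collation. The names ascii_letters and
a_before_Z_in_C are made up for this illustration; only base R functions
are used.

```r
# Hypothetical names for illustration; not part of any package.

# Enumerate the ASCII letters explicitly, so the class does not depend
# on how the current LC_COLLATE setting orders a range such as [a-Z].
ascii_letters <- paste0("[", paste(c(LETTERS, letters), collapse = ""), "]")

x <- c("abc", "Z9", "123")

# TRUE for strings containing at least one ASCII letter.
has_ascii_letter <- grepl(ascii_letters, x)

# POSIX class: matches whatever the current locale treats as alphabetic.
has_alpha <- grepl("[[:alpha:]]", x)

# Compare "a" and "Z" under C collation, restoring the previous
# LC_COLLATE setting when the function exits.
a_before_Z_in_C <- function() {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old))
  Sys.setlocale("LC_COLLATE", "C")
  "a" < "Z"
}

print(has_ascii_letter)
print(has_alpha)
print(a_before_Z_in_C())
```

The last call should give FALSE, since C collation follows ASCII order and
uppercase letters come before lowercase ones there, which is why a range
running from "a" to "Z" is rejected as invalid in such a locale.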
/Henrik
>
>
> [...]
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
>