[R-SIG-Mac] Solution to collation problems on Mac OS X
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sun Dec 28 12:51:34 CET 2008
On Sun, 28 Dec 2008, [Ricardo Rodriguez] Your XEN ICT Team wrote:
> Thanks, Brian and Simon.
>
> Sorry if this question is too basic: I'm trying to understand how the
> development cycle of R works. I am running R 2.8.0 GUI 1.26 (5256) on Mac
> OS X 10.5.6. The update to 2.8.1, released on 2008/12/22, is a pending task.
>
> When will this option/feature be included in a dmg available at
> http://cran.r-project.org/bin/macosx/?
At R 2.9.0 (scheduled for April 2009), or if Simon so chooses at R 2.8.2
(if there is such a version).
There is nothing to stop you building R from the sources yourself, and
there are nightly builds at r.research.att.com (I don't know if
R-2-8-branch builds will use --with-ICU in future: that's up to Simon).
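
If you do try such a build, a rough way to see whether it collates through ICU is sketched below. It assumes that capabilities() in your version of R has an "ICU" entry (that is an assumption, and the code guards against its absence), and the strings in x are only illustrative. It then sorts the same strings under the session's LC_COLLATE setting and under the C locale so the two orders can be compared.

```r
# Ask whether this build can collate through ICU. The "ICU" entry is an
# assumption about the R version in use: if it is absent or NA, the
# result is simply FALSE.
icu <- capabilities("ICU")
has_icu <- isTRUE(unname(icu))
cat("ICU collation available:", has_icu, "\n")

# Illustrative strings only. Escapes keep the source ASCII:
# "\u00f1" is n-tilde and "\u00e1" is a-acute.
x <- c("z", "\u00f1", "n", "Z", "a", "\u00e1")

# Order under the session's current LC_COLLATE setting.
by_locale <- sort(x)

# Order under the C locale, restoring the previous setting afterwards.
old <- Sys.getlocale("LC_COLLATE")
Sys.setlocale("LC_COLLATE", "C")
by_c_locale <- sort(x)
Sys.setlocale("LC_COLLATE", old)

print(by_locale)
print(by_c_locale)
```

If the two printed orders are identical in a UTF-8 locale such as es_ES, the build is most likely still falling back to comparison by code point.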
>
> Thank you so much for your work,
>
> Ricardo
>
> Prof Brian Ripley wrote:
>> Some of you will be aware that R ignores locale when collating strings on
>> Mac OS X: this arises from its inadequate FreeBSD-based wcscoll, whose man
>> page says
>>
>> BUGS
>> The current implementation of wcscoll() only works in single-byte
>> LC_CTYPE locales, and falls back to using wcscmp() in locales with
>> extended character sets.
>>
>> (and conventional Mac OS X locales are not 'single-byte' but UTF-8).
>>
>> Apple ships a modified version of ICU (International Components for
>> Unicode) for collation in its ObjC classes, and with Simon's help I have
>> added code to allow R to use this on Tiger and Leopard. This is now the
>> default in R-devel, and available in R-patched by configuring R with
>> --with-ICU.
>>
>> This originally came up for European Spanish, so in the es_ES locale:
>>
>> > example(Comparison)
>> ...
>> Cmprsn> ## by number
>> Cmprsn> writeLines(strwrap(paste(x, collapse=" "), width = 60))
>> ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
>> ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \
>> ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z
>> { | } ~ ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹
>> º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö ×
>> Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ
>> ö ÷ ø ù ú û ü ý þ ÿ
>>
>> Cmprsn> ## by locale collation
>> Cmprsn> writeLines(strwrap(paste(sort(x), collapse=" "), width = 60))
>> ` ´ ^ ¯ ¨ ¸ _ - , ; : ! ¡ ? ¿ . · ' " « » ( ) [ ] { } §
>> ¶ © ® @ * / \ & # % ° + ± ÷ × < = > ¬ | ¦ ~ ¤ ¢ $ £ ¥ 0 1 ¹
>> ½ ¼ 2 ² 3 ³ ¾ 4 5 6 7 8 9 a A ª á Á à À â Â å Å ä Ä ã Ã æ Æ
>> b B c C ç Ç d D ð Ð e E é É è È ê Ê ë Ë f F g G h H i I í Í
>> ì Ì î Î ï Ï j J k K l L m M n N ñ Ñ o O º ó Ó ò Ò ô Ô ö Ö õ
>> Õ ø Ø p P q Q r R s S ß t T u U ú Ú ù Ù û Û ü Ü v V w W x X
>> y Y ý Ý ÿ z Z þ Þ µ
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> R-SIG-Mac mailing list
>> R-SIG-Mac at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>>
>
>
> --
> Ricardo Rodríguez
> Your XEN ICT Team
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595