[R-SIG-Mac] Solution to collation problems on Mac OS X

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Dec 28 12:51:34 CET 2008


On Sun, 28 Dec 2008, [Ricardo Rodriguez] Your XEN ICT Team wrote:

> Thanks, Brian and Simon.
>
> Sorry if this question is too basic: I'm trying to understand how the 
> development cycle of R works. I am running R 2.8.0 GUI 1.26 (5256) on Mac 
> OS X 10.5.6. The update to 2.8.1, released on 2008/12/22, is a pending task.
>
> When will this option/feature be included in a dmg available at 
> http://cran.r-project.org/bin/macosx/?

At R 2.9.0 (scheduled for April 2009) or, if Simon so chooses, at R 2.8.2 
(if there is such a version).

There is nothing to stop you building R from the sources yourself, and 
there are nightly builds at r.research.att.com (I don't know if 
R-2-8-branch builds will use --with-ICU in future: that's up to Simon).
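
To see what a particular build does, something along these lines may help. 
It is only a sketch: the sample vector x is made up for illustration, and 
capabilities() reporting an "ICU" entry is an assumption that holds for 
current R releases but possibly not for a 2.8.x build, in which case the 
check simply reports FALSE.

```r
# Sketch: does this build report ICU support? (Assumes capabilities() has an
# "ICU" entry; if it does not, the lookup yields NA and we treat it as FALSE.)
has_icu <- isTRUE(unname(capabilities()["ICU"]))
cat("ICU collation available:", has_icu, "\n")

# Hypothetical sample: plain and accented letters, written with \u escapes
# so the script does not depend on the file encoding.
x <- c("b", "A", "\u00e1", "a", "B", "\u00c1")

# "By number": order by Unicode code point, which is what a wcscmp()
# fallback amounts to.
by_number <- x[order(vapply(x, utf8ToInt, integer(1)))]
print(by_number)

# "By locale collation": the result depends on the current locale and on
# whether ICU is in use, so no expected output is given here.
by_locale <- sort(x)
print(by_locale)
```

If by_locale comes out identical to by_number in a UTF-8 locale such as 
es_ES.UTF-8, the build is most likely not collating by locale.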

>
> Thank you so much for your work,
>
> Ricardo
>
> Prof Brian Ripley wrote:
>> Some of you will be aware that R ignores locale when collating strings on 
>> Mac OS X: this arises from its inadequate FreeBSD-based wcscoll, whose man 
>> page says
>> 
>> BUGS
>>      The current implementation of wcscoll() only works in single-byte
>>      LC_CTYPE locales, and falls back to using wcscmp() in locales with
>>      extended character sets.
>> 
>> (and conventional Mac OS X locales are not 'single-byte' but UTF-8).
>> 
>> Apple ships a modified version of ICU (International Components for 
>> Unicode) for collation in its ObjC classes, and with Simon's help I have 
>> added code to allow R to use this on Tiger and Leopard.  This is now the 
>> default in R-devel, and available in R-patched by configuring R with 
>> --with-ICU.
>> 
>> This originally came up for European Spanish, so in the es_ES locale:
>> 
>>> example(Comparison)
>> ...
>> Cmprsn> ## by number
>> Cmprsn> writeLines(strwrap(paste(x, collapse=" "), width = 60))
>> ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = >
>> ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \
>> ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z
>> { | } ~   ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ­ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹
>> º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö ×
>> Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ
>> ö ÷ ø ù ú û ü ý þ ÿ
>> 
>> Cmprsn> ## by locale collation
>> Cmprsn> writeLines(strwrap(paste(sort(x), collapse=" "), width = 60))
>>   ` ´ ^ ¯ ¨ ¸ _ ­ - , ; : ! ¡ ? ¿ . · ' " « » ( ) [ ] { } §
>> ¶ © ® @ * / \ & # % ° + ± ÷ × < = > ¬ | ¦ ~ ¤ ¢ $ £ ¥ 0 1 ¹
>> ½ ¼ 2 ² 3 ³ ¾ 4 5 6 7 8 9 a A ª á Á à À â Â å Å ä Ä ã Ã æ Æ
>> b B c C ç Ç d D ð Ð e E é É è È ê Ê ë Ë f F g G h H i I í Í
>> ì Ì î Î ï Ï j J k K l L m M n N ñ Ñ o O º ó Ó ò Ò ô Ô ö Ö õ
>> Õ ø Ø p P q Q r R s S ß t T u U ú Ú ù Ù û Û ü Ü v V w W x X
>> y Y ý Ý ÿ z Z þ Þ µ
>> 
>> 
>> ------------------------------------------------------------------------
>> 
>> _______________________________________________
>> R-SIG-Mac mailing list
>> R-SIG-Mac at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>> 
>
>
> -- 
> Ricardo Rodríguez
> Your XEN ICT Team
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

