[Rd] Non-ASCII chars in R code
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed May 17 20:40:08 CEST 2006
The report on R-help about problems loading package irr (in a
UTF-8 locale, it seemed) prompted me to look a little deeper. There are
quite a few packages with Latin-1 chars in their .R files, and a couple in
Apart from non-ASCII chars in comments, this is a problem as the code
concerned cannot be represented in some locales R runs in (for example
Japanese on Windows). It happens that irr is so small that lazy-loading
is not used, but when lazy-loading or a saved image is used, the locale in
use when the package is installed determines how the code is parsed (and
may not be the same as when the package is used, and indeed it is not
uncommon on Linux/Unix systems for different users to use different
This means that using non-ASCII chars is not portable, and I've added code
to R CMD check in R-devel to warn about such usage. In the examples I
have investigated the usages have been
- messages in a non-English language, typically French.
- startup messages with people's names.
- use of characters that I can only guess are intended to be in the
WinAnsi encoding, e.g. a copyright symbol.
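
To make the idea of such a check concrete, here is a minimal sketch in Python of scanning source lines for bytes outside the ASCII range. It is not the code added to R CMD check: the function name find_non_ascii and the sample lines are hypothetical, and unlike the remarks above it makes no attempt to treat comments differently from code.

```python
from typing import List, Tuple


def find_non_ascii(lines: List[bytes]) -> List[Tuple[int, bytes]]:
    """Return (1-based line number, raw line) for every line that
    contains at least one byte outside the ASCII range (>= 0x80)."""
    hits = []
    for number, raw in enumerate(lines, start=1):
        if any(byte >= 0x80 for byte in raw):
            hits.append((number, raw))
    return hits


# Hypothetical contents of an .R file that was saved in Latin-1.
sample = [
    b'f <- function(x) x + 1\n',
    b'cat("Mod\xe8le ajust\xe9\\n")\n',  # French message with Latin-1 bytes
    b'# plain ASCII comment\n',
]

for number, raw in find_non_ascii(sample):
    print("line", number, "contains non-ASCII bytes:", raw)
```

Working on raw bytes rather than decoded text is deliberate: the scan then gives the same answer whatever locale it is run in.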
The only reason I have not made this an error is that people might want to
produce packages for a known locale, e.g. a student class, but perhaps it
should be an error for packages submitted to CRAN.
I do not believe there is much we can do about this: messages which are
not entirely in ASCII cannot be displayed on many R platforms and it seems
incorrect to allow French messages and not Japanese ones.
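
The representability problem can be illustrated outside R. The following Python sketch assumes a French message stored as Latin-1 bytes and uses cp932 as a stand-in for the encoding of a Japanese Windows locale; the helper names decodes_as and encodable_in are hypothetical. It only demonstrates the encoding mismatch, not how R itself parses or displays such strings.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """True if the byte string is valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def encodable_in(text: str, encoding: str) -> bool:
    """True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# A French message, written with escapes so this file itself stays ASCII.
message = "Mod\u00e8le ajust\u00e9"
latin1_bytes = message.encode("latin-1")

# A Japanese message, again written with escapes.
japanese = "\u65e5\u672c\u8a9e"

# The Latin-1 bytes are not valid UTF-8, so a UTF-8 locale cannot read them.
print("valid as UTF-8:", decodes_as(latin1_bytes, "utf-8"))

# The accented letters have no representation in cp932 or ASCII.
print("French in cp932:", encodable_in(message, "cp932"))
print("French in ASCII:", encodable_in(message, "ascii"))

# The converse holds too: the Japanese text cannot be stored as Latin-1.
print("Japanese in Latin-1:", encodable_in(japanese, "latin-1"))
```

The symmetry in the last two checks is the point made above: neither the French nor the Japanese message survives a move to the other's locale.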
The packages currently throwing warnings are
FactoMineR FunCluster JointGLM LoopAnalyst Sciviews ade4 adehabitat ape
climatol crossdes deal grasper irr lsa mvrpart pastecs sn surveillance
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595