[Rd] Want non-ASCII characters in data package

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Jan 1 11:58:16 CET 2011


On Wed, 29 Dec 2010, Kevin R. Coombes wrote:

> Hi,
>
> I have a data frame that includes several names that (if typeset correctly) 
> require accented characters not available in the ASCII character set.
>
> I'd like to include this data frame as example data in an R package.  I'd 
> also like the R CMD check warning about the use of non-ASCII characters to go 
> away, in part so I could submit the package somewhere that wouldn't balk at 
> the presence of the warning.  (I gather from older posts that there are 
> environment variables to skip this check.  Those will work for me personally 
> but will not necessarily appease the maintainers of sites like CRAN where I 
> might want to submit the package.)
>
> Is there any way to use the correctly accented characters by setting a 
> different character encoding or equivalent for the data frame? Or am I forced 
> to remove the offending accents in order to be ASCII-pure and thus leave 
> people and places with an incorrect representation of their names?

The latter is inevitable.  There is no encoding that will work 
correctly for everyone (see 'Writing R Extensions' §1.7.1): e.g. 
Chinese Windows users have only ASCII and Chinese characters (and only 
one of two sets of Chinese characters).  Again, good practice and 
compromises are discussed in 'Writing R Extensions' -- these days 
using UTF-8 will do a good job for most R users.
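
As a rough sketch of the UTF-8 route (this is not taken from 'Writing R 
Extensions'; the data frame `people` and the names in it are made up for 
illustration), one could write the accented characters as \u escapes so 
the source stays ASCII-only, make sure the strings are stored as UTF-8, 
and then see which entries R CMD check would be likely to flag:

```r
# Hypothetical example data; the names are illustrative only.
# \u escapes keep this source file pure ASCII.
people <- data.frame(
  name = c("Jos\u00e9", "Fran\u00e7ois", "Smith"),
  stringsAsFactors = FALSE
)

# Convert to UTF-8 and mark the non-ASCII strings as such.
people$name <- enc2utf8(people$name)

# Entries that cannot be represented in ASCII come back as NA from iconv().
non_ascii <- is.na(iconv(people$name, from = "UTF-8", to = "ASCII"))

# Inspect the declared encoding of each string
# (pure ASCII strings are always reported as "unknown").
enc <- Encoding(people$name)

print(data.frame(name = people$name, non_ascii = non_ascii, encoding = enc))
```

Whether the result displays correctly still depends on the user's locale 
and fonts, which is the compromise described above.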

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

