[Rd] Want non-ASCII characters in data package
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sat Jan 1 11:58:16 CET 2011
On Wed, 29 Dec 2010, Kevin R. Coombes wrote:
> Hi,
>
> I have a data frame that includes several names that (if typeset correctly)
> require accented characters not available in the ASCII character set.
>
> I'd like to include this data frame as example data in an R package. I'd
> also like the R CMD check warning about the use of non-ASCII characters to go
> away, in part so I could submit the package somewhere that wouldn't balk at
> the presence of the warning. (I gather from older posts that there are
> environment variables to skip this check. Those will work for me personally
> but will not necessarily appease the maintainers of sites like CRAN where I
> might want to submit the package.)
>
> Is there any way to use the correctly accented characters by setting a
> different character encoding or equivalent for the data frame? Or am I forced
> to remove the offending accents in order to be ASCII-pure and thus leave
> people and places with an incorrect representation of their names?
The latter is inevitable if the names must display correctly for
everyone: no single encoding works on every platform (see 'Writing R
Extensions' §1.7.1). For example, Chinese Windows users have only ASCII
and Chinese characters (and only one of the two sets of Chinese
characters). Good practice and the available compromises are also
discussed in 'Writing R Extensions'; these days using UTF-8 will do a
good job for most R users.
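
As a rough illustration of the UTF-8 compromise, here is a minimal sketch. It assumes the names are held in a character column of a data frame. The data frame `people` and its contents are hypothetical and not taken from the original question. Writing the accented characters as `\u` escapes keeps the source file itself ASCII, and converting to ASCII with `iconv()` shows which entries would be affected by an ASCII-only policy.

```r
# Hypothetical example data: names chosen for illustration only.
people <- data.frame(
  name = c("Erd\u0151s", "Poincar\u00e9", "Gauss"),
  stringsAsFactors = FALSE
)

# Inspect how R has marked each string's encoding.
print(Encoding(people$name))

# iconv() returns NA for any string that cannot be represented in ASCII,
# so this flags the entries that contain accented characters.
ascii_name <- iconv(people$name, from = "UTF-8", to = "ASCII")
non_ascii <- is.na(ascii_name)
print(people$name[non_ascii])
```

This only identifies and marks the strings. Whether a particular repository accepts a package containing them is a separate question that the sketch does not settle.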
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595