[Rd] UTF-8 and .Rd files

Peter Dalgaard p.dalgaard at biostat.ku.dk
Tue Jun 27 23:11:57 CEST 2006

"Göran Broström" <goran.brostrom at gmail.com> writes:

> Seriously, I thought that unicode and utf8 would make problems like
> this disappear, but obviously we may have to wait another 30 years.
> Thanks for all the input.
> George

Well, I do tend to think that we should just use UTF-8, assuming that
people have the relevant glyphs. If they don't, then they might get
little hollow rectangles but so what? (This entails stamping out the
use of iso-8859-? which I think I have previously pointed out as the
historical mistake. Easier said than done, though, especially since
8859-1, er, -15 managed to get established as a de facto standard
in a couple of key places like HTTP and NNTP.)
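
To make the mixed-encoding problem concrete, here is a minimal sketch in Python (an illustration only, not something taken from the thread). It encodes one word both ways and then reads each byte string with the wrong decoder. The variable names are made up for the example.

```python
# Illustrative sketch: one word, two encodings, and what happens on a mix-up.
text = "\u00d8ster"  # "Oster" with a leading O-slash (U+00D8)

utf8_bytes = text.encode("utf-8")         # O-slash becomes two bytes: C3 98
latin1_bytes = text.encode("iso-8859-1")  # O-slash becomes one byte:  D8

# Reading UTF-8 bytes as ISO-8859-1 never fails, it just yields wrong text.
misread = utf8_bytes.decode("iso-8859-1")

# Reading ISO-8859-1 bytes as UTF-8 is rejected here, because D8 announces
# a two-byte sequence and the next byte is not a continuation byte.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False
```

The silent case is arguably the worse one: nothing signals that the text was decoded with the wrong assumption.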

Transliterations are really abominable and completely ambiguous, e.g.
oe means o-umlaut in Swedish and German, but o-slash in Danish and
Norwegian, and we already have at least two interpretations of "roer"
where oe represents two distinct vowels...
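
The ambiguity can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the table `OE_MEANS` and the helper `candidate_readings` are hypothetical names invented here, and the table only encodes the four conventions mentioned above.

```python
# Hypothetical table: what the ASCII digraph "oe" stands for, per language.
OE_MEANS = {
    "sv": "\u00f6",  # Swedish: o-umlaut
    "de": "\u00f6",  # German: o-umlaut
    "da": "\u00f8",  # Danish: o-slash
    "no": "\u00f8",  # Norwegian: o-slash
}


def candidate_readings(word):
    """Return every spelling the transliterated word could stand for."""
    # "oe" may also be a literal "o" followed by "e", so keep the word itself.
    readings = {word}
    for letter in OE_MEANS.values():
        readings.add(word.replace("oe", letter))
    return readings


readings = candidate_readings("roer")
```

Without knowing the language and the intended word, a reader (or a program) cannot pick one of the candidates, which is the sense in which the transliteration is not reversible.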


   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
