[R] Which character coding for source under Windows?
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Jan 9 12:46:47 CET 2004
As I said, Rterm/Rgui do no encoding. If you use cat or sink, the exact
byte value of each char you used is written out. If you *display* it you
may see something different, but I have already explained that.
Unless you do octal/hex dumps on files you will be confused by display
encodings.
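
As a minimal sketch of such a dump done from within R (assuming a
current version of R; the sample string and the temporary file are
purely illustrative), one can write a string with cat() and then read
the file back as raw bytes, which bypasses any display encoding:

```r
# Build a string holding "caf" followed by the single byte 0xE9.
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))

# Write it with cat() to a temporary file (hypothetical location).
tmp <- tempfile(fileext = ".txt")
cat(x, file = tmp, sep = "")

# Read the file back as raw bytes: this is the hex dump.
bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
print(bytes)

# Compare with the bytes of the in-memory string.
print(identical(bytes, charToRaw(x)))
```

How the byte 0xE9 looks on screen depends on the console or editor, but
the raw values shown by print(bytes) do not.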
On Fri, 9 Jan 2004, Philippe Grosjean wrote:
> OK, now with this information and some experiments, it appears that the ANSI
> encoding is used by default under Windows for source(), sink(), etc...
No, it is the native encoding. There is no `ANSI' encoding, but your
machine is probably set up to use WinANSI (not ANSI).
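
To see what the native encoding is on a given machine, something like
the following may help. It is only a sketch and assumes a current
version of R, where l10n_info() and Sys.getlocale() are available; the
names and values reported differ between platforms.

```r
# Report the character-type locale that R is running in.
ctype <- Sys.getlocale("LC_CTYPE")
print(ctype)

# Summarise the native encoding: multibyte or not, UTF-8 or Latin-1.
# On Windows the list also reports the code page in use.
info <- l10n_info()
print(info)
```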
> That is, if I understand correctly:
>
> - source() that uses parse(file= ) is assuming nothing, because it just
> reads bytes and the S language uses only characters among the first 128
> ones, which are the same in ANSI or DOS encoding.
Not true: S can use 8-bit characters.
> - sink() is consistent with this behaviour *under RGUI* and uses ANSI, as
> does the default encoding for connections, where getOption("encoding") ==
> 0:255.
>
> Now, my problem comes with Rterm, as it is a console program that uses DOS
> encoding under Windows. So, with Rterm, there is a "translation" of the ANSI
> characters sourced from a text file into DOS characters (for instance, those
> in a cat(".....") instruction), and the reverse with sink(). Is this
> inconsistent behaviour between Rgui and Rterm decided on purpose for some
> reason? Or is it just a consequence of the inconsistency between windowed
> programs (Rgui) and command-line programs (Rterm) under Windows?
>
> Anyway, how could I use characters encoded over the 128th position in a
> character string with source(), sink(), cat(), etc... and get the same
> behaviour between Rgui and Rterm? Also, I suppose I would have problems with
> such characters in Unix/Linux and MacOS, which would interpret them
> differently?
You *do* get the same behaviour. If you do example(text) you get the same
chars in RGui and Rterm, even if

    options(pager="console")
    help(text)

displays them differently. That is nothing to do with Rterm, though.
And if you want to transfer files from Windows to another OS, you have to
tell R on that OS what encoding you used. That is all.
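
A minimal sketch of telling R the encoding, assuming a current version
of R (the encoding argument of file() and the iconv() function did not
exist in this form in 2004) and assuming the file was written in
Latin-1, which agrees with WinANSI for the byte used here. The file
itself is simulated, not a real transferred file.

```r
# Simulate a file written on Windows: "caf", the byte 0xE9, a newline.
tmp <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), tmp)

# Option 1: declare the encoding on the connection, so that R
# re-encodes the text to the native encoding while reading.
con <- file(tmp, open = "r", encoding = "latin1")
txt <- readLines(con)
close(con)

# Option 2: read the bytes unchanged and convert afterwards.
raw_txt <- readLines(tmp, warn = FALSE)
converted <- iconv(raw_txt, from = "latin1", to = "UTF-8")

# Inspect the converted bytes rather than trusting the display.
print(charToRaw(converted))
```

The first form suits files that are read once; the second keeps the
original bytes available in case the guessed encoding turns out to be
wrong.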
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595