[R] Which character coding for source under Windows?

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri Jan 9 12:46:47 CET 2004


As I said, Rterm/Rgui do no encoding.  If you use cat or sink, the exact 
numeric char you used is written out.  Maybe if you *display* it you see 
something different, but I have already explained that.

Unless you do octal/hex dumps on files you will be confused by display 
encodings.
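
A minimal sketch of such a check from within R. The byte 0xE9 (e-acute in WinANSI/Latin-1) and the temporary file are illustrative assumptions, not something from the original exchange. The string is written with cat() and the file is then read back as raw bytes, so no display encoding is involved:

```r
# Build a one-character string holding the single byte 0xE9
# (assumed test character; it is e-acute in WinANSI/Latin-1).
ch <- rawToChar(as.raw(0xE9))

# Write it with cat() to a temporary file (hypothetical target).
tmp <- tempfile()
cat(ch, file = tmp)

# Read the file back as raw bytes: a hex dump, independent of any display.
bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)
print(bytes)
```

If the file holds exactly the byte that went in, any difference seen on screen comes from the display and not from R.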

On Fri, 9 Jan 2004, Philippe Grosjean wrote:

> OK, now with this information and some experiments, it appears that the ANSI
> encoding is used by default under Windows for source(), sink(), etc...

No, it is the native encoding.  There is no `ANSI' encoding, but your 
machine is probably set up to use WinANSI (not ANSI).
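
To see which native encoding a session is actually using, something along these lines should work in recent versions of R. This is a sketch only: l10n_info() may not be available in older releases, and the values reported depend entirely on how the machine is set up.

```r
# Character-type locale of the running session; on Windows this names
# the code page, on other systems it is a locale string.
loc <- Sys.getlocale("LC_CTYPE")
print(loc)

# Summary flags for the native encoding (multibyte, UTF-8, Latin-1).
info <- l10n_info()
print(info)
```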

> That is, if I understand correctly:
> 
> - source(), which uses parse(file= ), assumes nothing, because it just
> reads bytes and the S language uses only characters among the first 128
> ones, which are the same in ANSI or DOS encoding.

Not true: S can use 8-bit characters.

> - sink() is consistent with this behaviour *under RGUI* and uses ANSI, as
> does the default encoding for connections() with getOption("encoding") ==
> 0:255 assumes the same as does sink()
> 
> Now, my problem comes with Rterm... as it is a console program that uses DOS
> encoding under Windows. So, with Rterm, there is a "translation" of the ANSI
> characters sourced from a text file into DOS characters (for instance, those
> in a cat(".....") instruction... and the reverse with sink(). Was this
> inconsistent behaviour between Rgui and Rterm chosen on purpose for some
> reason? Or is it just a consequence of the inconsistency between window
> programs (Rgui) and command line programs (Rterm) under Windows?
> 
> Anyway, how could I use characters encoded over the 128th position in a
> character string with source(), sink(), cat(), etc... and get the same
> behaviour between Rgui and Rterm? Also, I suppose I would have problems with
> such characters in Unix/Linux and MacOS, which would interpret them
> differently?

You *do* get the same behaviour.  If you do example(text) you get the same 
chars in RGui and Rterm, even if 

options(pager="console")
help(text)

displays them differently.  That is nothing to do with Rterm, though.

And if you want to transfer files from Windows to another OS, you have to 
tell R on that OS what encoding you used.  That is all.
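
A minimal sketch of what that can look like in recent versions of R, assuming the file was written on Windows in Latin-1/WinANSI. The temporary file and the sample word are made up for illustration, and the encoding argument to readLines() may not exist in older releases:

```r
# Simulate a file written on Windows: "cafe" with e-acute, plus newline,
# as Latin-1 bytes (assumed sample content).
tmp <- tempfile()
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xE9, 0x0A)), tmp)

# Declare the encoding when reading; the bytes are kept and marked latin1.
txt <- readLines(tmp, encoding = "latin1")

# Convert explicitly so the result does not depend on the local locale.
utf8 <- enc2utf8(txt)
print(charToRaw(utf8))
```

The bytes on disk are never changed by this; only R's interpretation of them is.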

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



