[R] Unicode characters (R 2.7.0 on Windows XP SP3 and Hardy Heron)

Duncan Murdoch murdoch at stats.uwo.ca
Fri May 30 19:36:14 CEST 2008


On 5/30/2008 12:58 PM, Hans-Jörg Bibiko wrote:
> Hi,
> 
> To put it simply: Windows cannot handle utf-8 data. There is no utf-8
> locale available.

Code page 65001 is utf-8.  Most text editors (including Notepad) include 
an option to save in the UTF-8 encoding.

Some programs don't fully support utf-8 (some don't even support the 
native UCS-2), but most don't care, because they simply pass the bytes 
through unchanged.  That's the nice thing about utf-8.
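
As a rough illustration of why byte-oriented programs can ignore the 
encoding, here is a minimal Python sketch.  The sample string and the 
variable names are made up for this example and are not from the thread; 
it only shows that ASCII bytes stay ASCII in utf-8 and that every byte of 
a non-ASCII character is 0x80 or above.

```python
# Illustrative sketch only: the sample text is an assumption, not thread data.
text = "gloss: слово"            # ASCII label followed by a Cyrillic word
data = text.encode("utf-8")      # the bytes a utf-8 file would contain

# ASCII characters keep their single-byte values in utf-8.
ascii_part = "gloss: ".encode("ascii")
assert data.startswith(ascii_part)

# Every byte belonging to a non-ASCII character is >= 0x80, so it can
# never be mistaken for an ASCII delimiter such as ':' or ' '.
assert all(b >= 0x80 for b in data[len(ascii_part):])

# A program that only splits on an ASCII delimiter can work on the raw
# bytes without knowing the encoding; the pieces remain valid utf-8.
key, value = data.split(b": ")
print(key.decode("utf-8"), value.decode("utf-8"))
```

Programs that need to count or match individual characters, rather than 
just pass bytes along, are the ones that have to know the encoding.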

So in what sense can Windows not handle utf-8 data?

Duncan Murdoch

> If your corpus contains only Russian data, perhaps with English glosses,
> you can try setting the language of Rgui.exe to Russian.
> Then at least you can use grep and strsplit, because they depend
> on the chosen locale.
> 
> 
> On 30.05.2008, at 17:14, Stefan Th. Gries wrote:
> 
> 
>> # I can do all that on Linux, but this arises in a context where
>> # many other character processing issues are explained for Mac,
>> # Linux, *and* Windows, and I'd hate to have to say "this is the
>> # one thing you can't do on Windows"
>>
> Unfortunately I have to say this quite often :)
> 
> Cheers,
> 
> --Hans
> 


