[R] UTF-8 or Unicode on Windows PC
Hans-Joerg Bibiko
bibiko at eva.mpg.de
Tue Apr 22 16:49:05 CEST 2008
On 21 Apr 2008, at 12:33, Prof Brian Ripley wrote:
>> Is it possible to download a compiled snapshot of 2.7.0 for Windows
>> XP?
> Yes, http://cran.r-project.org/bin/windows/base/rtest.html
> And it is due for release tomorrow.
I played with 2.7.0 on Windows XP. I can do things which couldn't be
done with 2.6.x. Many many thanks for the effort!!!
But I always reached a point where I could not find a solution,
because Windows has no UTF-8 locale(s).
Does Windows Vista have UTF-8 locales?
If I'm dealing with known languages I'm able to work around a lot of
these problems.
But my/our problem is that we have to deal with different languages at
the same time [in a data.frame]. Furthermore I/we have to deal with
IPA symbols, which have no locale; and grep, strsplit, etc. work on
top of the chosen locale. Thus I'm not able to use strsplit on a
string which contains German, Russian, and IPA symbols, because all
glyphs which are not part of the chosen locale are displayed [e.g. as
output of strsplit()] literally as <U+XXXX>.
That's why the only solution is to use a UTF-8 environment (OS) or,
for hard-liners, to transform each glyph into numbers and to do
research on those numbers (which is really annoying ;).
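
A minimal sketch of that "glyphs into numbers" route, assuming Python 3
does the conversion outside R. The function names and the sample string
are hypothetical and only illustrate the idea; the numbers are plain
Unicode code points, so no locale is involved.

```python
def to_code_points(text):
    """Return the Unicode code points of text as a list of integers."""
    return [ord(ch) for ch in text]


def from_code_points(points):
    """Rebuild the original string from a list of code points."""
    return "".join(chr(p) for p in points)


# Hypothetical sample mixing German (U+00DF), Russian (U+0438)
# and an IPA symbol (U+0283); escapes keep the source file pure ASCII.
sample = "Stra\u00dfe \u0438 \u0283"
points = to_code_points(sample)
print(points)
```

The integer lists can be stored and compared without any encoding
trouble, and from_code_points() gives the text back when needed.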
Unfortunately at this point I have to give up. Maybe there is someone
who can give me further advice with Windows.
The only thing I have in mind, maybe, is to use Perl, Python etc.
beforehand to manipulate the data before they are analyzed using R.
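
As a rough illustration of such a preprocessing step, assuming Python 3:
the sketch below splits a mixed string into characters without reference
to any locale, keeping combining marks (common in IPA transcriptions)
attached to their base character. split_segments is a hypothetical
helper, and this is a simplification, not full Unicode grapheme
segmentation.

```python
import unicodedata


def split_segments(text):
    """Split text into characters, attaching combining marks to the
    preceding base character."""
    segments = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks
        if segments and unicodedata.combining(ch):
            segments[-1] += ch
        else:
            segments.append(ch)
    return segments


# Hypothetical input: "a" + combining tilde (U+0303), IPA esh (U+0283),
# Cyrillic i (U+0438)
print(split_segments("a\u0303\u0283\u0438"))
```

The resulting segments could then be written out, for example one per
line in a UTF-8 file or as code points, for later analysis in R.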
--Hans