[Rd] match function causing bad performance when using table function on factors with multibyte characters on Windows
Karl Ove Hufthammer
karl at huftis.org
Tue Jan 25 11:49:11 CET 2011
Matthew Dowle wrote:
> I'm not sure, but note the difference in locale between
> Linux (UTF-8) and Windows (non UTF-8). As far as I
> understand it R much prefers UTF-8, which Windows doesn't
> natively support. Otherwise you could just change your
> Windows locale to a UTF-8 locale to make R happier.
>
[...]
>
> If anybody knows a way to trick R on Linux into thinking it has
> an encoding similar to Windows then I may be able to take a
> look if I can reproduce the problem in Linux.
After changing the locale to an ISO 8859-1 locale, i.e.:
export LC_ALL="en_US.ISO-8859-1"
export LANG="en_US.ISO-8859-1"
I could *not* reproduce it; that is, ‘table’ is as fast on the non-ASCII
factor as it is on the ASCII factor.
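
For anyone who wants to repeat this check, here is a minimal sketch of how it could be run from a shell. It is not the exact test used above: the vector size, the factor levels and the names n, x_ascii and x_nonascii are illustrative assumptions. It also assumes the en_US.ISO-8859-1 locale is installed and that Rscript is on the PATH. If the locale is missing, R typically falls back to the C locale and the timings say nothing about ISO 8859-1.

```shell
# Switch this shell session to the ISO 8859-1 locale used above.
export LC_ALL="en_US.ISO-8859-1"
export LANG="en_US.ISO-8859-1"

# Report the character set actually in effect. If the locale is not
# installed, this typically reports plain ASCII rather than ISO-8859-1.
if command -v locale >/dev/null 2>&1; then
    locale charmap 2>/dev/null || true
fi

# Time table() on an ASCII factor and on a non-ASCII factor.
# The data are made up for illustration; they are not from the thread.
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e '
        n <- 1e6
        x_ascii    <- factor(sample(c("a", "b", "c"), n, replace = TRUE))
        x_nonascii <- factor(sample(c("\u00e6", "\u00f8", "\u00e5"), n, replace = TRUE))
        print(l10n_info())                     # shows whether R sees a UTF-8 or Latin-1 locale
        print(system.time(table(x_ascii)))
        print(system.time(table(x_nonascii)))
    ' || true
fi
```

If the two system.time() results are close, the slowdown is not reproduced in that locale. The l10n_info() output is worth checking first, to confirm that R really picked up the intended locale.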
--
Karl Ove Hufthammer