[Rd] \U with more than 4 digits returns the wrong character
Richard Cotton
richierocks at gmail.com
Thu Dec 4 20:37:13 CET 2014
Great spot, thanks Mark.
This really ought to appear somewhere in the ?Quotes help page.
A warning under Windows would also be nicer behaviour than silently
returning the wrong value.
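
For anyone following along, the wrong character is consistent with the
code point being cut down to its low 16 bits. A minimal sketch of that
arithmetic in R (it only illustrates the numbers; the truncation is an
assumption about the symptom, not a description of what the parser
actually does, and the variable names are mine):

```r
# Illustration only: assumes the symptom is a 16-bit truncation of the code point.
full <- 0x1d4d0L                      # U+1D4D0, the character that was requested
truncated <- bitwAnd(full, 0xFFFFL)   # keep only the low 16 bits
hex <- sprintf("U+%04X", truncated)   # format the result as a code point label
print(hex)                            # "U+D4D0", the Hangul syllable that was printed
```

That matches the "\Ud4d0" character reported in the original post.
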
On 4 December 2014 at 22:24, Mark van der Loo <mark.vanderloo at gmail.com> wrote:
> Richie,
>
> The R language definition [1] says (10.3.1):
>
> \Unnnnnnnn \U{nnnnnnnn}
> (where multibyte locales are supported and not on Windows, otherwise
> an error). Unicode character with given hex code – sequences of up to
> eight hex digits.
>
>
> Best,
> Mark
>
> [1] http://cran.r-project.org/doc/manuals/r-release/R-lang.html
> http://www.markvanderloo.eu
> -------------------------------------------------------------------
> If you cannot quantify it,
> you don't know what you're talking about
>
>
> On Thu, Dec 4, 2014 at 8:00 PM, Richard Cotton <richierocks at gmail.com> wrote:
>> If I type a character using \U syntax that has more than 4 digits, I
>> get the wrong character. For example,
>>
>> "\U1d4d0"
>>
>> should print a mathematical bold script capital A. See
>> http://www.fileformat.info/info/unicode/char/1d4d0/index.htm
>>
>> On my machine, it prints the Hangul character corresponding to
>>
>> "\Ud4d0"
>> http://www.fileformat.info/info/unicode/char/d4d0/index.htm
>>
>> It seems that the hex-digit part is overflowing at 16^4.
>>
>> I tested this on R 3.1.2 and devel (2014-12-03 r67101) x64 under
>> Windows. I played around with Sys.setlocale and options("encoding"),
>> but couldn't get the expected value.
>>
>> Can others reproduce this? It feels like a bug, but experience tells
>> me I probably have something silly going on with my setup.
>>
>> --
>> Regards,
>> Richie
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
--
Regards,
Richie
Learning R
4dpiecharts.com