[Rd] Error: invalid multibyte string
Thomas Lumley
tlumley at u.washington.edu
Fri Oct 27 17:34:18 CEST 2006
On Fri, 27 Oct 2006, Henrik Bengtsson wrote:
> In Section "Package subdirectories" in "Writing R Extensions" [2.4.0
> (2006-10-10)] it says:
>
> "Only ASCII characters (and the control characters tab, formfeed, LF
> and CR) should be used in code files. Other characters are accepted in
> comments, but then the comments may not be readable in e.g. a UTF-8
> locale. Non-ASCII characters in object names will normally [1] fail
> when the package is installed. Any byte will be allowed [2] in a
> quoted character string (but \uxxxx escapes should not be used), but
> non-ASCII character strings may not be usable in some locales and may
> display incorrectly in others.", where the footnote [2] reads "It is
> good practice to encode them as octal or hex escape sequences".
>
> (Note: ASCII refers (correctly) to the 7-bit ASCII [0-127] and none of
> the 8-bit ASCII extensions [128-255].)
>
> According to the sentence about quoted strings, the following R/*.R code
> should still be valid:
>
> pads <- sapply(0:64, FUN=function(x) paste(rep("\xFF", x), collapse=""));
That looks like it should be valid (at least according to the
documentation), even though it won't run usefully on UTF-8 locales. What
you wrote before was:
>> > On Thu, 26 Oct 2006, Henrik Bengtsson wrote:
>> >
>> > > I'm observing the following on different platforms:
>> > >
>> > >> parse(text='"\\x7F"')
>> > > expression("\177")
>> > >> parse(text='"\\x80"')
>> > > Error: invalid multibyte string
and that error *is* correct behaviour -- you can't parse() something that
isn't a valid character string.
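
For what it's worth, here is a rough sketch of how one might check this sort of thing without going through parse() at all. It assumes a version of R that provides validUTF8() (3.3.0 or later, so not the 2.4.0 under discussion), and byte_string() is a made-up helper name, not anything in base R. It builds the strings from raw bytes, so the source file itself stays pure ASCII:

```r
# Hypothetical helper (not part of base R): build a one-byte string
# from its numeric code, keeping the source file 7-bit ASCII.
byte_string <- function(code) rawToChar(as.raw(code))

s7f <- byte_string(0x7F)  # DEL, the last 7-bit ASCII code
s80 <- byte_string(0x80)  # a lone high byte, not valid UTF-8 on its own

# validUTF8() reports, per element, whether the bytes form valid UTF-8.
valid <- validUTF8(c(s7f, s80))
print(valid)

# Padding in the style of the quoted 'pads' line, assembled from raw
# bytes instead of a "\xFF" literal in the code file.
pads <- vapply(0:64,
               function(n) rawToChar(as.raw(rep(0xFF, n))),
               character(1))

# Count bytes rather than characters, since these strings are not
# valid text in a multibyte locale.
pad_bytes <- nchar(pads, type = "bytes")
print(pad_bytes[1:5])
```

Whether strings like these are usable afterwards still depends on the locale, as the documentation quoted above warns; the sketch only shows one way to construct and inspect them.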
-thomas