[Rd] Error: invalid multibyte string

Henrik Bengtsson hb at stat.berkeley.edu
Thu Oct 26 04:31:22 CEST 2006


I'm observing the following on different platforms:

> parse(text='"\\x7F"')
expression("\177")
> parse(text='"\\x80"')
Error: invalid multibyte string
...
> parse(text='"\\xFF"')
Error: invalid multibyte string

However,

cat("\x7F\n\x80\n...\xFF\n")

works.  All of this is with R --vanilla.
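As a possibly relevant data point: both failing sessions listed below report a UTF-8 LC_CTYPE, and a lone byte in the range 0x80-0xFF is not a valid UTF-8 sequence by itself. The following is only a sketch of how one could check that per byte; it assumes validUTF8(), which exists in R >= 3.3.0 and therefore not in any of the versions listed in this post.

```r
# Which single bytes form a valid UTF-8 string on their own?
# Assumption: validUTF8() is available (R >= 3.3.0).
bytes <- 1:255   # 0 is left out: R strings cannot contain embedded nuls

# Build one single-byte string per value, without going through the parser.
one_byte_strings <- vapply(bytes,
                           function(b) rawToChar(as.raw(b)),
                           character(1))

ok <- validUTF8(one_byte_strings)

range(bytes[ok])    # valid on their own:   1 .. 127 (0x01 .. 0x7F)
range(bytes[!ok])   # invalid on their own: 128 .. 255 (0x80 .. 0xFF)
```

The split at 0x7F/0x80 matches where parse() starts to fail above, but this does not by itself show whether parse() is right to reject the escape.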

SYSTEMS GIVING THE ERROR:
> sessionInfo()
R version 2.4.0 (2006-10-03)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;LC_MONETARY=en_AU.UTF-8;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=en_AU.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_AU.UTF-8;LC_IDENTIFICATION=C

R version 2.4.0 Patched (2006-10-03 r39576)
i686-pc-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C


SYSTEMS OK:
R version 2.4.0 Under development (unstable) (2006-07-23 r38687)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

R version 2.4.0 (2006-10-03)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

R version 2.4.0 Patched (2006-10-10 r39600)
i386-pc-mingw32
locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

Version 2.3.0 (2006-04-24)
x86_64-unknown-linux-gnu
locale: <not reported>


All of the above have the following packages attached:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

We identified this problem because R CMD check complained:

> * checking package dependencies ... WARNING
> Error in deparse(e[[2]]) : invalid multibyte string
> Execution halted

The cause is that our source code contains the string literal "\xFF"
(equivalently "\377"), which we use as a terminator in a vector buffer;
"\0" can't be used for other reasons.
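One possible way to sidestep the string literal, sketched here under the assumption that the buffer can be held as a raw vector (the names terminator and buffer are made up for illustration and are not our actual code):

```r
# Hypothetical sketch: keep the 0xFF terminator as a raw byte, so that no
# "\xFF" string literal has to pass through the parser in a UTF-8 locale.
terminator <- as.raw(0xFF)

# Two records, each followed by the terminator byte.
buffer <- c(charToRaw("abc"), terminator,
            charToRaw("de"),  terminator)

# Position of the first terminator, then the bytes in front of it.
end   <- which(buffer == terminator)[1]
first <- rawToChar(buffer[seq_len(end - 1)])
first   # "abc"
```

Whether this fits depends on how the buffer is consumed elsewhere, so treat it as a workaround idea rather than a fix.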

Is the above a bug in R or one in my head?

/H



