[Rd] Error: invalid multibyte string
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Thu Oct 26 18:43:45 CEST 2006
Thomas Lumley <tlumley at u.washington.edu> writes:
> On Thu, 26 Oct 2006, Henrik Bengtsson wrote:
>
> > I'm observing the following on different platforms:
> >
> >> parse(text='"\\x7F"')
> > expression("\177")
> >> parse(text='"\\x80"')
> > Error: invalid multibyte string
>
> Yes. It's an invalid multibyte string. In UTF-8 a single byte is a valid
> character string only if it is below x80, so x7F is fine but x80 is not.
> In fact x80 is not the leading byte of any valid UTF-8 character.
>
> You have to work out what the Unicode code point is for whatever character
> you were expecting to be x80 and convert that to UTF-8.
>
> I'm surprised that one of your UTF-8 machines worked -- I don't think it
> should.
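A minimal sketch of the byte rule quoted above, for anyone who wants to check it directly. It assumes a reasonably recent R (validUTF8() was added in R 3.3.0) and, purely for illustration, that the byte x80 was meant as the Euro sign in Windows-1252; the intended character in the original report is not known.

```r
# A lone byte is valid UTF-8 only when it is below 0x80.
ok  <- rawToChar(as.raw(0x7f))
bad <- rawToChar(as.raw(0x80))
validUTF8(c(ok, bad))    # TRUE for 0x7f, FALSE for 0x80

# Assumption (not from the original report): 0x80 was meant as the
# Euro sign in Windows-1252. Convert it to UTF-8 and inspect the result.
euro <- iconv(bad, from = "CP1252", to = "UTF-8")
charToRaw(euro)          # the three-byte UTF-8 sequence e2 82 ac
utf8ToInt(euro)          # the Unicode code point, 8364 (U+20AC)
```

If the source encoding is something other than Windows-1252, the `from` argument has to change accordingly; iconvlist() shows the names the local iconv knows.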
Interestingly, we can parse, but not print or deparse:
> x<-parse(text='"\\x80"')
> x
Error: invalid multibyte string
> z <- deparse(x)
Error in deparse(x) : invalid multibyte string
> cat(x[[1]])
�>
(the last line has a funny little cedilla-like symbol in pos 1)
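One way to look at such a string without tripping the print/deparse error is to examine its bytes instead of printing it. This is only a sketch: the string is built with rawToChar() as a stand-in for the parsed one, and the variable names are made up for illustration.

```r
# Stand-in for the string obtained from parse(): a single byte 0x80.
s <- rawToChar(as.raw(0x80))

# Show the underlying byte rather than trying to print the string.
charToRaw(s)             # 80

# Re-encode as UTF-8, replacing invalid bytes with a <xx> hex marker.
shown <- iconv(s, from = "UTF-8", to = "UTF-8", sub = "byte")
shown                    # "<80>"
```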
--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907