[Rd] Error in substring: invalid multibyte string
Toby Hocking
tdhock5 @end|ng |rom gm@||@com
Sat Jun 27 00:57:06 CEST 2020
Hi all,
I'm getting the following error from substr:
> substr("<I>Jens Oehlschl\xe4gel-Akiyoshi", 1, 100)
Error in substr("<I>Jens Oehlschl\xe4gel-Akiyoshi", 1, 100) :
invalid multibyte string at '<e4>gel-A<6b>iyoshi'
Is that normal / intended? I've tried setting the Encoding/locale to
Latin-1/UTF-8, but that does not help. nchar gives me a similar error:
> nchar("<I>Jens Oehlschl\xe4gel-Akiyoshi")
Error in nchar("<I>Jens Oehlschl\xe4gel-Akiyoshi") :
invalid multibyte string, element 1
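For reference, a minimal sketch of how the offending byte can be inspected without triggering the error. It uses only base R (validUTF8, nchar with type = "bytes", charToRaw); the variable names are illustrative, and position 17 is simply where the \xe4 byte falls in this particular string.

```r
# The string from the error above; \xe4 is a single raw byte.
x <- "<I>Jens Oehlschl\xe4gel-Akiyoshi"

# Check whether the bytes form valid UTF-8 (they do not: 0xe4 would
# start a multi-byte sequence, but the next byte is plain ASCII "g").
is_utf8 <- validUTF8(x)

# Counting bytes instead of characters does not need a valid encoding.
n_bytes <- nchar(x, type = "bytes")

# Look at the raw byte at the position where the string breaks.
bad_byte <- charToRaw(x)[17]

print(is_utf8)
print(n_bytes)
print(bad_byte)
```

nchar(x, allowNA = TRUE) is another option: it is documented to return NA instead of raising an error when the string is invalid in the current multibyte locale.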
I find it strange that substr and nchar give an error, while regexpr works
and reports the length:
> regexpr(".*", "<I>Jens Oehlschl\xe4gel-Akiyoshi")
[1] 1
attr(,"match.length")
[1] 29
Is that inconsistency normal/intended?
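A possible workaround, sketched under the assumption that the byte \xe4 is Latin-1 for "ä" (the archive page does not state its encoding here, so this is a guess based on the name): convert the string to UTF-8 explicitly with iconv, after which substr, nchar and regexpr all operate on characters and agree.

```r
x <- "<I>Jens Oehlschl\xe4gel-Akiyoshi"

# Assumption: the input bytes are Latin-1. iconv re-encodes them as UTF-8
# and marks the result accordingly.
y <- iconv(x, from = "latin1", to = "UTF-8")

n_chars <- nchar(y)             # character count of the converted string
prefix  <- substr(y, 1, 10)     # substr now works on characters
m       <- regexpr(".*", y)     # match length is reported in characters

print(n_chars)
print(prefix)
print(attr(m, "match.length"))
```

If the source encoding is not known, iconv(x, from = "latin1", to = "UTF-8") will still succeed, because every byte is a valid Latin-1 character, but the resulting characters may be wrong.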
btw this example comes from our very own list:
> readLines("https://stat.ethz.ch/pipermail/r-devel/1999-November/author.html")[28]
[1] "<I>Jens Oehlschl\xe4gel-Akiyoshi"
Best,
Toby