[Rd] Embedded nuls in strings
Herve Pages
hpages at fhcrc.org
Tue Aug 7 23:06:56 CEST 2007
Hi,
From ?rawToChar:
  'rawToChar' converts raw bytes either to a single character string
  or a character vector of single bytes. (Note that a single
  character string could contain embedded nuls.)
Allowing embedded nuls in a string might be an interesting experiment, but it
seems to cause trouble for most of the string manipulation functions.
A string with an embedded nul:
> raw0 <- as.raw(c(65:68, 0, 70))
> string0 <- rawToChar(raw0)
> string0
[1] "ABCD\0F"
nchar() should return 6, but it returns 4:
> nchar(string0)
[1] 4
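For comparison, the byte count can be read directly off the raw vector, which does not depend on how the string functions treat the nul (and does not rely on string0 being created at all). A minimal sketch; raw0 is redefined so the snippet stands alone, and n_bytes and nul_pos are just illustrative names:

```r
# Same bytes as above: "ABCD", one nul byte, then "F"
raw0 <- as.raw(c(65:68, 0, 70))

# length() on a raw vector counts every byte, including the nul
n_bytes <- length(raw0)               # 6

# Position of the embedded nul within the raw vector
nul_pos <- which(raw0 == as.raw(0))   # 5
```

This only inspects the bytes; it says nothing about what nchar() itself ought to do.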
In addition, this embedded nul seems to break almost all string manipulation and
searching functions:
grep("F", string0)
strsplit(string0, split=NULL, fixed=TRUE)[[1]]
tolower(string0)
chartr("F", "x", string0)
substr(string0, 6, 6)
etc.
Not surprisingly, they all seem to treat string0 as if it were "ABCD"!
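As a possible workaround, some of these operations can be done on the raw vector instead of the string. This is only a sketch of byte-level analogues, not a claim about how the string functions should behave; f_pos, sixth and raw1 are illustrative names, and raw0 is redefined so the snippet stands alone:

```r
# Same bytes as above: "ABCD", one nul byte, then "F"
raw0 <- as.raw(c(65:68, 0, 70))

# Byte-level search for "F" (rough analogue of grep("F", string0))
f_pos <- which(raw0 == charToRaw("F"))

# Sixth byte as a string (rough analogue of substr(string0, 6, 6))
sixth <- rawToChar(raw0[6])

# Replace "F" by "x" byte for byte (rough analogue of chartr("F", "x", string0))
raw1 <- raw0
raw1[raw1 == charToRaw("F")] <- charToRaw("x")
```

Here the byte after the nul stays reachable because nothing is ever converted to a string that contains the nul.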
Cheers,
H.