[Rd] nchar reporting wrong width when zero-space character is present?
Mark van der Loo
mark.vanderloo at gmail.com
Wed Nov 19 10:58:41 CET 2014
If I include the zero-width non-breaking space (\ufeff) in a string,
nchar seems to compute the wrong number of columns used by 'cat'.
> x <- "f\ufeffoo"
I would expect "3" here. Going through the documentation of 'Encoding'
and 'encodeString', I don't think this is expected behavior. Am I
missing something? If it is a bug I will file a report.
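
For reference, a minimal sketch of the comparison in question, assuming a UTF-8 locale like the one in the session info below. The variable names n_chars, n_bytes and n_width are only illustrative. The value reported for type = "width" may differ between R versions, so it is printed rather than asserted.

```r
# Sketch: compare the three counts nchar() can report for the same string.
x <- "f\ufeffoo"                      # U+FEFF sits between "f" and "oo"

n_chars <- nchar(x, type = "chars")   # number of characters (code points)
n_bytes <- nchar(x, type = "bytes")   # number of bytes in the stored encoding
n_width <- nchar(x, type = "width")   # number of columns cat() is expected to use

# Print the three counts side by side, then the string itself for a visual check.
print(c(chars = n_chars, bytes = n_bytes, width = n_width))
cat(x, "\n")
```

If the width count exceeds the number of columns the terminal actually uses for the output of cat(x), that is the discrepancy described above.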
Secondly, the documentation of 'nchar' states that with type='chars'
(the default) it returns "the number of human-readable characters". I
would hardly call the zero-width space human-readable. Also, since for example
it is probably more accurate to say that the number of symbols
(abstract characters) is counted, noting that some of the symbols in
an alphabet represented by an encoding may be invisible (or hardly
Much thanks in advance,
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
 LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 LC_TIME=nl_NL.UTF-8 LC_COLLATE=en_US.UTF-8
 LC_MONETARY=nl_NL.UTF-8 LC_MESSAGES=en_US.UTF-8
 LC_PAPER=nl_NL.UTF-8 LC_NAME=C
 LC_ADDRESS=C LC_TELEPHONE=C
 LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C
attached base packages:
 stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):