[Rd] Unicode whitespace
hadley wickham
h.wickham at gmail.com
Fri Jan 4 19:13:15 CET 2008
It would be nice if R ignored more Unicode whitespace characters.
For example, if I have "\u2028" in a command (which I get from a
line break in Keynote) I get the following error:
> qplot(carat, price, data = diamonds,
colour=clarity)
Error: unexpected input in "qplot(carat, price, data = diamonds, ?"
I occasionally have the same problem when copying and pasting from
emails as well.
Wikipedia lists the following codepoints as whitespace (I'm sure there
is a more definitive reference but I could not find one with some
quick googling):
U+0009-U+000D (Control characters, containing TAB, CR and LF)
U+0020 SPACE
U+0085 NEL
U+00A0 NBSP
U+1680 OGHAM SPACE MARK
U+180E MONGOLIAN VOWEL SEPARATOR
U+2000-U+200A (different sorts of spaces)
U+2028 LSP
U+2029 PSP
U+202F NARROW NBSP
U+205F MEDIUM MATHEMATICAL SPACE
U+3000 IDEOGRAPHIC SPACE
Would it be possible for R to treat these all in the same way? (Or
does it already, but my R is misconfigured?)
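
In the meantime, something like the following might work as a stopgap in
user code. It is only a rough sketch: normalize_whitespace() is a
hypothetical helper, not anything in R itself, and it assumes the pasted
command is available as a single character string. It maps the non-ASCII
codepoints from the list above to a plain space before the text is parsed.

```r
# Hypothetical helper (not part of R): replace the non-ASCII whitespace
# codepoints listed above with an ordinary space (U+0020).
# Assumes x is a single character string.
normalize_whitespace <- function(x) {
  # U+0009-U+000D and U+0020 are omitted because they are plain ASCII
  # whitespace already.
  ws <- c(0x0085, 0x00A0, 0x1680, 0x180E, 0x2000:0x200A,
          0x2028, 0x2029, 0x202F, 0x205F, 0x3000)
  cp <- utf8ToInt(enc2utf8(x))   # string -> vector of codepoints
  cp[cp %in% ws] <- 0x20L        # swap each listed codepoint for SPACE
  intToUtf8(cp)                  # codepoints -> single string
}

# The command from above, with U+2028 where the line break was
cmd <- "qplot(carat, price, data = diamonds,\u2028colour = clarity)"
expr <- parse(text = normalize_whitespace(cmd))
```

Working on codepoints rather than a regular expression avoids any
dependence on how the regex engine handles these characters. Note that it
would also rewrite such characters inside string literals, which may not
always be wanted.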
Hadley
--
http://had.co.nz/