[Rd] NA in character vectors
David Brahm
brahm@alum.mit.edu
Fri, 15 Mar 2002 17:30:20 -0500
Hi all,
While R-1.5.0 is still unfrozen, I'd like to try again to generate interest in
my favorite pet peeve: NA's in character vectors. Last October I wrote:
> LETTERS[c(NA,2)] in S is c("","B"), but in R is c("NA","B").
We had an interesting discussion then, and I learned (from Duncan Murdoch and
Thomas Lumley) that R does have an internal code for missing character values
(R_NaString), but it is easily confused with the string "NA". Check this:
R> z <- c(LETTERS[c(2,NA)], "NA", paste("NA"))
R> is.na(z)
[1] FALSE TRUE TRUE FALSE
R> z[3]==z[4]
[1] TRUE
R> z=="NA"
[1] FALSE TRUE TRUE TRUE
Thomas Lumley <tlumley@u.washington.edu> suggested that this weird behavior
essentially arises because the parser converts "NA" to R_NaString, and so...
> It looks like we basically need to
> 1) stop the parser generating R_NaString from "NA"
> 2) Change PRINTNAME(R_NaString) to avoid ambiguity
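To make the goal concrete, here is a minimal sketch of what I would hope the session above returns once the parser no longer turns "NA" into R_NaString. The "hoped for" values are my assumption about the fixed behavior, not what R-1.4.x prints today, and the variable names are only for illustration.

```r
# Same vector as in the session above.
z <- c(LETTERS[c(2, NA)], "NA", paste("NA"))

# Hoped-for behavior (an assumption about a fixed R, not current output):
# only the element produced by indexing with NA is missing.
na.flags <- is.na(z)      # hoped for: FALSE TRUE FALSE FALSE

# The literal string "NA" should still equal itself ...
same.str <- z[3] == z[4]  # hoped for: TRUE

# ... and comparing the missing element should yield NA, not TRUE.
eq.na <- z == "NA"        # hoped for: FALSE NA TRUE TRUE
```

The point is that is.na() and == "NA" would then never both be TRUE for the same element, so a real string "NA" (a country code or a person's initials, say) survives intact.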
While I still like the simple S-Plus model (""), Thomas's suggestions (with
PRINTNAME(R_NaString) = "<NA>" for example) would be OK too. Thanks for
listening (again)!
--
-- David Brahm (brahm@alum.mit.edu)
"I write this letter not to nag nor to whine, but to prod." - Lisa Simpson