[R] Character vectors, NA's and "" in merges
David Kane <a296180 at mica.fmr.com>
Wed Sep 26 14:10:32 CEST 2001
I often use merge with data frames that contain character vectors in which
some elements are the string "NA" (not the same thing, obviously, as a missing
NA in a numeric or factor vector). For example, the stock ticker for Nabisco
was "NA". Unfortunately (for me), merge seems to insist on inserting "NA" for
missing values. My question: Is there some way around this?
Here is a simple example:
> version
         _
platform sparc-sun-solaris2.6
arch     sparc
os       solaris2.6
system   sparc, solaris2.6
status
major    1
minor    3.0
year     2001
month    06
day      22
language R
> a <- data.frame(x = 1:4)
> b <- data.frame(x = 1:3, y = c("NA", "a", "b"))
> merge(a, b, all.x = TRUE)
  x  y
1 1 NA
2 2  a
3 3  b
4 4 NA
Rows 1:3 are what I expect them to be. Row 4 is "wrong" in the sense that
data frame b did not contain a row for x = 4. Of course, in a sense *any*
value placed in row 4, including "", is potentially misleading. Perhaps I am
misunderstanding the meaning of "NA" in a character vector (i.e., I am not
allowed to have "real" values that are that string).
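One way to check whether the string "NA" and a missing value are really being conflated is to keep y as a character column and ask is.na() directly. The following is only a sketch: it assumes a version of R in which data.frame() accepts stringsAsFactors = FALSE and merge() fills unmatched rows with a true missing value, which may not describe R 1.3.0. The names missing_y and nabisco are made up for illustration.

```r
# Keep y as character so that "NA" stays an ordinary string.
# Assumption: data.frame() supports the stringsAsFactors argument.
a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"), stringsAsFactors = FALSE)

m <- merge(a, b, all.x = TRUE)

# TRUE only where merge() found no matching row in b
missing_y <- is.na(m$y)

# TRUE only where y holds the literal string "NA" (the Nabisco ticker)
nabisco <- !is.na(m$y) & m$y == "NA"
```

Under those assumptions the two cases stay distinguishable in the data even when the printed output makes them look alike, so the test should be made with is.na() rather than by comparing against the string.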
If there were some way (a "nomatch" argument?) for the user to specify what
value is used for missing character strings, then I would be fine. Again, I
suspect that my real problem is not understanding how to specify
"NA" -- meaning Nabisco's ticker symbol -- in a character vector.
Any suggestions would be much appreciated.
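For what it is worth, the "nomatch" idea mentioned above could perhaps be approximated after the merge rather than inside it. This is a minimal sketch, not an existing feature: fill_unmatched and its nomatch parameter are hypothetical names, merge() itself has no such argument, and the code assumes data.frame() accepts stringsAsFactors = FALSE.

```r
# Hypothetical helper: merge() has no "nomatch" argument, so the fill is
# applied afterwards. Only true missing values are replaced; the string
# "NA" is left alone.
fill_unmatched <- function(df, column, nomatch = "") {
  values <- as.character(df[[column]])  # also copes with a factor column
  values[is.na(values)] <- nomatch
  df[[column]] <- values
  df
}

a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"), stringsAsFactors = FALSE)

filled <- fill_unmatched(merge(a, b, all.x = TRUE), "y", nomatch = "")
```

Whatever fill value is chosen can still collide with a legitimate entry, so this only moves the ambiguity to a string that is less likely to occur in the data.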
Dave Kane
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._