[R] Characters vectors, NA's and "" in merges
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Sep 26 14:41:11 CEST 2001
On Wed, 26 Sep 2001, David Kane wrote:
> I often use merge with dataframes that contain character vectors which have
> elements that are sometimes "NA" (meaning the string NA, not the same thing,
> obviously, as NA in a numeric or factor vector). For example, the stock ticker
> for Nabisco was "NA". Unfortunately (for me), it seems like merge insists on
> inserting "NA" for missing values. My question: Is there some way around this?
> Here is a simple example:
>
> > version
>          _
> platform sparc-sun-solaris2.6
> arch     sparc
> os       solaris2.6
> system   sparc, solaris2.6
> status
> major    1
> minor    3.0
> year     2001
> month    06
> day      22
> language R
>
> > a <- data.frame(x = 1:4)
> > b <- data.frame(x = 1:3, y = c("NA", "a", "b"))
Take a look. b$y is a factor with levels "a" and "b", and a missing first
value.
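A quick way to take that look, as a sketch and not a transcript: the calls below are standard, but what they print depends on the R version. Under the 1.3.0 release shown above, the string "NA" is read as a missing value when the column is built; later releases may keep it as an ordinary string, so treat the comments as things to check, not as expected output.

```r
# Rebuild the example data frame from the original post.
b <- data.frame(x = 1:3, y = c("NA", "a", "b"))

# What type did data.frame() give the y column: factor or character?
class(b$y)

# If y is a factor, these are its levels; for a character column this is NULL.
levels(b$y)

# TRUE marks the elements that R regards as missing.
is.na(b$y)
```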
> > merge(a, b, all.x = TRUE)
>   x  y
> 1 1 NA
> 2 2  a
> 3 3  b
> 4 4 NA
>
> Rows 1:3 are what I expect them to be. Row 4 is "wrong" in the sense that
> dataframe b did not contain a row for x = 4. Of course, there is a sense that
> *any* value, including "", that is placed in row 4 is potentially
> misleading. Perhaps I am misunderstanding the meaning of "NA" in a character
> vector (i.e., I am not allowed to have "real" values that are that string).
That is the correct answer. Because you asked for all.x=TRUE, you
got a missing value there in row 4 col 2.
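To see the effect of all.x on its own, the sketch below runs the merge from the post with and without it. Only a, b and merge() come from the original example; m_inner and m_left are names made up for illustration.

```r
a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"))

# Default merge: only the x values present in both data frames are kept.
m_inner <- merge(a, b)
nrow(m_inner)                      # 3 rows, x = 1, 2, 3

# all.x = TRUE: every row of a is kept; x = 4 has no partner in b,
# so its y is filled with a missing value.
m_left <- merge(a, b, all.x = TRUE)
nrow(m_left)                       # 4 rows
is.na(m_left$y[m_left$x == 4])     # TRUE
```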
> If there were some way (a "nomatch" argument?) that the user could specify
> what missing values are used for character strings, then I would be
> fine. Again, I suspect that my real problem is not understanding how to specify
> "NA" -- meaning Nabisco's ticker symbol -- in a character vector.
You cannot avoid it being taken as the missing value, AFAIK.
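One way to at least tell the two cases apart, offered as a workaround sketch and not as a feature of merge(): add a marker column to b before merging. The column name in_b is hypothetical. After the merge, a missing in_b means the row had no match in b, whatever happened to y.

```r
a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"))

# Hypothetical marker column: TRUE for every row that really exists in b.
b$in_b <- TRUE

m <- merge(a, b, all.x = TRUE)

# Rows of a with no partner in b get a missing marker.
no_match <- is.na(m$in_b)
m$x[no_match]                      # 4

# Optionally turn the marker into a plain TRUE/FALSE column.
m$in_b <- !no_match
```

This leaves y untouched, so the Nabisco row stays distinguishable from the unmatched row even where y itself prints as NA.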
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595