merge.data.frame can coerce character vectors to factor in some circumstances (PR#1608)
a296180@agate.fmr.com
Wed, 29 May 2002 13:10:19 +0200 (MET DST)
If the following two conditions are met:
1) all.x is TRUE
2) at least 1 row in x does not have a match in y
then any character vectors in y will be coerced to be factors. Here is a simple
example (previously provided on r-devel):
> x <- data.frame(a = 1:4)
> y <- data.frame(b = LETTERS[1:3])
> y$b <- as.character(y$b)
> z <- merge(x, y, by = 0, all.x = TRUE)
> z
  Row.names a    b
1         1 1    A
2         2 2    B
3         3 3    C
4         4 4 <NA>
> sapply(z, data.class)
Row.names         a         b
 "factor" "numeric"  "factor"
>
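Until the behaviour is changed, one possible workaround is to convert the affected columns back after the merge. The following is only a sketch and not part of the original example: it converts every factor column in the result to character, which is too aggressive if some columns are meant to be factors (the Row.names column is converted as well).

```r
x <- data.frame(a = 1:4)
y <- data.frame(b = LETTERS[1:3])
y$b <- as.character(y$b)
z <- merge(x, y, by = 0, all.x = TRUE)

# Convert any factor column back to character; other columns are left alone.
# Caveat: this also converts columns that were intentionally factors.
z[] <- lapply(z, function(col) if (is.factor(col)) as.character(col) else col)

sapply(z, data.class)
```

On versions of R where the coercion does not occur, the lapply step leaves the columns unchanged.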
This problem could be fixed by changing the line in merge.data.frame:
for (i in seq(along = y)) is.na(y[[i]]) <- (lxy + 1):(lxy + nxx)
to:
for (i in seq(along = y)) y[((lxy + 1):(lxy + nxx)), i] <- NA
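The two forms can be compared outside of merge with a sketch like the one below. The padded data frame and the index zap are illustrative stand-ins (assumptions, not taken from the merge.data.frame source) for y after it has been extended with rows to be blanked out, and for (lxy + 1):(lxy + nxx). Both forms set the same cells to NA; whether the column class is preserved may depend on the version of R, so the classes are printed rather than asserted.

```r
y <- data.frame(b = LETTERS[1:3])
y$b <- as.character(y$b)

# Assumption: stand-in for y padded with one extra row that should become NA.
padded <- y[c(1:3, 1), , drop = FALSE]
zap <- 4  # hypothetical stand-in for (lxy + 1):(lxy + nxx)

# Form currently used in merge.data.frame
current <- padded
for (i in seq(along = current)) is.na(current[[i]]) <- zap

# Form proposed in this report
proposed <- padded
for (i in seq(along = proposed)) proposed[zap, i] <- NA

sapply(current, data.class)
sapply(proposed, data.class)
```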
If this is a feature rather than a bug (in which case I would like to know
why), I suggest adding the following sentence to the documentation for merge,
at the end of the section on all.x:
"Be aware that, if all.x equals `TRUE', character vectors in `y' will be
converted to factors if any rows in `x' have no matching row in `y'."
Thanks,
Dave Kane