[Rd] Bug: as.matrix.data.frame() treats numeric vectors with "levels" attribute as factors
David Skalinder
@k@||nder @end|ng |rom w|@c@edu
Fri Mar 1 23:04:27 CET 2019
Hello,
I think I've found a bug in as.matrix.data.frame(). The function's
documentation says: "The method for data frames will return a character
matrix if there is only atomic columns and any
non-(numeric/logical/complex) column, applying as.vector to factors and
format to other non-character columns. Otherwise, the usual coercion
hierarchy (logical < integer < double < complex) will be used..."
However, when the function checks for non-numeric columns, it includes
the following check for each column xj:
length(levels(xj)) > 0L
As a result, any atomic, numeric, non-factor column that carries a "levels"
attribute causes as.matrix.data.frame() to return a character matrix
instead of following the usual coercion hierarchy as documented. For
example, a column that is an unclassed factor will unexpectedly force
as.matrix.data.frame() to return a character matrix.
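To make the distinction concrete, here is a minimal sketch that applies the quoted condition to single vectors and compares it with is.factor() and is.numeric(). The name has_levels is a hypothetical helper of mine, not something in base R, and this only mirrors the one expression quoted above rather than the whole function.

```r
# Hypothetical helper: the levels test quoted above, in isolation.
has_levels <- function(xj) length(levels(xj)) > 0L

plain <- 1:2                               # ordinary integer vector
tagged <- 1:2
attr(tagged, "levels") <- "test"           # integer with a stray "levels" attribute
unclassed <- unclass(factor(c("a", "b")))  # integer codes, "levels" kept
fac <- factor(c("a", "b"))                 # a genuine factor

cols <- list(plain = plain, tagged = tagged,
             unclassed = unclassed, factor = fac)

# One row per vector: does the levels test fire, and is it really a factor?
report <- data.frame(
  has_levels = vapply(cols, has_levels, logical(1)),
  is_factor  = vapply(cols, is.factor, logical(1)),
  is_numeric = vapply(cols, is.numeric, logical(1))
)
print(report)
```

The "tagged" and "unclassed" rows are the interesting ones: the levels test fires even though neither is a factor and both are numeric.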
To reproduce:
-----
df <- data.frame(v1 = 1:2, v2 = 3:4)
typeof(as.matrix(df)) # integer, as documented
attr(df[[1]], "levels") <- "test"
class(df[[1]]) # integer
typeof(as.matrix(df)) # character, despite all atomic, numeric,
                      # non-factor cols
df2 <- data.frame(v1 = unclass(factor(c("a", "b"))), v2 = 1:2)
typeof(as.matrix(df2)) # character, despite unclassing factor
attr(df2[[1]], "levels") <- NULL
typeof(as.matrix(df2)) # integer, even though no types changed
-----
I can reproduce this in 3.5.1 and 3.5.2, and I can't see anything
related in the upcoming changes or in Bugzilla, so I thought I'd report
it here. I don't know what the cleanest fix will be, but it seems that
either the function or the documentation should be changed so that they
align.
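In the meantime, one possible workaround is to drop the "levels" attribute from non-factor columns before converting. This is only a sketch, and it assumes the stray attribute is not needed afterwards. The name strip_stray_levels is a hypothetical helper, not a base R function.

```r
# Hypothetical helper: remove "levels" from every column that is not a factor.
strip_stray_levels <- function(df) {
  for (j in seq_along(df)) {
    if (!is.factor(df[[j]]) && !is.null(attr(df[[j]], "levels"))) {
      attr(df[[j]], "levels") <- NULL
    }
  }
  df
}

df2 <- data.frame(v1 = unclass(factor(c("a", "b"))), v2 = 1:2)
m <- as.matrix(strip_stray_levels(df2))
typeof(m) # integer
```

Genuine factor columns are left untouched, so they are still converted as the documentation describes.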
Please let me know if you need any additional info!
Thanks
David