[Rd] scale.default gives an incorrect error message when is.numeric() fails on a dgeMatrix
Martin Maechler
maechler at stat.math.ethz.ch
Thu Mar 1 18:52:46 CET 2018
>>>>> Michael Chirico <michaelchirico4 at gmail.com>
>>>>> on Tue, 27 Feb 2018 20:18:34 +0800 writes:
Slightly amended 'Subject': (unimportant mistake: a dgeMatrix is *not* sparse)
MM: modified to commented R code, slightly changed from your post:
## I am attempting to use the lars package with a sparse input feature matrix,
## but the following fails:
library(Matrix)
library(lars)
data(diabetes) # from 'lars'
##UAagghh! not like this -- both attach() *and* as.data.frame() are horrific!
##UA attach(diabetes)
##UA x = as(as.matrix(as.data.frame(x)), 'dgCMatrix')
x <- as(unclass(diabetes$x), "dgCMatrix")
lars(x, diabetes$y, intercept = FALSE)
## Error in scale.default(x, FALSE, normx) :
## length of 'scale' must equal the number of columns of 'x'
## More specifically, scale.default fails as called from lars():
normx <- new("dgeMatrix",
             x = c(4, 0, 9, 1, 1, -1, 4, -2, 6, 6)*1e-14, Dim = c(1L, 10L),
             Dimnames = list(NULL,
                             c("x.age", "x.sex", "x.bmi", "x.map", "x.tc",
                               "x.ldl", "x.hdl", "x.tch", "x.ltg", "x.glu")))
scale.default(x, center=FALSE, scale = normx)
## Error in scale.default(x, center = FALSE, scale = normx) :
## length of 'scale' must equal the number of columns of 'x'
> The problem is that this check fails because is.numeric(normx) is FALSE:
> if (is.numeric(scale) && length(scale) == nc)
> So, the error message is misleading. In fact length(scale) is the same as
> nc.
Correct, twice.
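To make the two observations concrete, here is a minimal sketch that separates the type check from the length check. The values and the name 'nc' are illustrative assumptions, not taken from lars; only the Matrix package is needed.

```r
library(Matrix)  # recommended package, shipped with R

## A 1 x 10 dense Matrix with the same shape as 'normx' above (illustrative values)
normx <- new("dgeMatrix", x = as.numeric(1:10), Dim = c(1L, 10L))
nc <- 10L  # stands in for ncol(x) inside scale.default()

is.numeric(normx)               # reported as FALSE in this thread
length(normx) == nc             # the length condition on its own holds
is.numeric(as.numeric(normx))   # explicit coercion yields a plain numeric vector
```

So the combined condition can fail purely because of the type test, even though the error message only mentions the length.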
> At a minimum, the error message needs to be repaired; do we also want to
> attempt as.numeric(normx) (which I believe would have allowed scale to work
> in this case)?
It seems sensible to allow both 'center' and 'scale' to only
have to *obey* as.numeric(.) rather than fulfill is.numeric(.).
Though that is not a bug in scale() as its help page has always
said that 'center' and 'scale' should either be a logical value
or a numeric vector.
For that reason I can really claim a bug in 'lars' which should
really not use
scale(x, FALSE, normx)
but rather
scale(x, FALSE, scale = as.numeric(normx))
and then all would work.
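A small self-contained sketch of that workaround, using a tiny made-up matrix instead of the diabetes data (the objects 'x' and 'normx' here are illustrative assumptions):

```r
library(Matrix)

## Tiny base matrix and a 1 x 2 dgeMatrix of per-column scale factors
x <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
normx <- new("dgeMatrix", x = c(2, 4), Dim = c(1L, 2L))

## Coerce explicitly, so that scale.default() sees a plain numeric vector
scaled <- scale(x, center = FALSE, scale = as.numeric(normx))
scaled                          # each column divided by its scale factor
attr(scaled, "scaled:scale")    # the scale factors actually used
```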
> -----------------
> (I'm aware that there's some import issues in lars, as the offending line
> to create normx *should* work, as is.numeric(sqrt(drop(rep(1, nrow(x)) %*%
> (x^2)))) is TRUE -- it's simply that lars doesn't import the appropriate S4
> methods)
> Michael Chirico
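For reference, a sketch of what the quoted expression computes when 'x' is an ordinary base matrix (the matrix below is made up for illustration): the Euclidean norm of each column, returned as a plain numeric vector.

```r
## Illustrative base matrix; columns have norms 5 and 10
x <- matrix(c(3, 4, 0, 6, 8, 0), nrow = 3)

## Same expression as in the quoted text
normx <- sqrt(drop(rep(1, nrow(x)) %*% (x^2)))

is.numeric(normx)                      # a plain numeric vector in the base-matrix case
all.equal(normx, sqrt(colSums(x^2)))   # equivalent to the column norms via colSums()
```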
Yes, 'lars' has _not_ been updated since Spring 2013, notably
because its authors have been saying (for rather more than 5
years I think) that one should really use
require("glmnet")
instead.
Your point is still valid that it would be easy to enhance
base :: scale.default() so it'd work in more cases.
Thank you for that. I do plan to consider such a change in
R-devel (planned to become R 3.5.0 in April).
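One way such leniency could look from the outside is sketched below. This is a hypothetical wrapper ('scale_lenient' is an invented name), not the change made in R-devel: it simply coerces 'center' and 'scale' with as.numeric() when they are neither logical nor numeric, then defers to scale().

```r
library(Matrix)

## Hypothetical helper, not part of base R
scale_lenient <- function(x, center = TRUE, scale = TRUE) {
  ## Leave logical and numeric arguments alone; coerce anything else
  coerce <- function(v) if (is.logical(v) || is.numeric(v)) v else as.numeric(v)
  base::scale(x, center = coerce(center), scale = coerce(scale))
}

## Illustrative data: a base matrix and a dgeMatrix of scale factors
x <- matrix(c(2, 4, 6, 10, 20, 30), nrow = 3)
s <- new("dgeMatrix", x = c(2, 10), Dim = c(1L, 2L))
res <- scale_lenient(x, center = FALSE, scale = s)
```

Doing the coercion inside scale.default() itself would spare callers such as lars from having to know about it.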
Martin Maechler,
ETH Zurich