[Rd] scale.default gives an incorrect error message when is.numeric() fails on a dgeMatrix
Michael Chirico
michaelchirico4 at gmail.com
Fri Mar 2 01:27:07 CET 2018
Thanks. I know the setup code is a mess; I just duct-taped something together
from the examples in lars (which are a mess in turn). In fact, when I
messaged Prof. Hastie he recommended using glmnet. I wonder why lars is
kept on CRAN if they've no intention of maintaining it... but I digress...
On Mar 2, 2018 1:52 AM, "Martin Maechler" <maechler at stat.math.ethz.ch>
wrote:
> >>>>> Michael Chirico <michaelchirico4 at gmail.com>
> >>>>> on Tue, 27 Feb 2018 20:18:34 +0800 writes:
>
> Slightly amended 'Subject': (unimportant mistake: a dgeMatrix is *not*
> sparse)
>
> MM: modified to commented R code, slightly changed from your post:
>
>
> ## I am attempting to use the lars package with a sparse input feature matrix,
> ## but the following fails:
>
> library(Matrix)
> library(lars)
> data(diabetes) # from 'lars'
> ##UAagghh! not like this -- both attach() *and* as.data.frame() are horrific!
> ##UA attach(diabetes)
> ##UA x = as(as.matrix(as.data.frame(x)), 'dgCMatrix')
> x <- as(unclass(diabetes$x), "dgCMatrix")
> lars(x, diabetes$y, intercept = FALSE) # 'y' is not attached, so use diabetes$y
> ## Error in scale.default(x, FALSE, normx) :
> ## length of 'scale' must equal the number of columns of 'x'
>
> ## More specifically, scale.default fails as called from lars():
> normx <- new("dgeMatrix",
>              x = c(4, 0, 9, 1, 1, -1, 4, -2, 6, 6)*1e-14, Dim = c(1L, 10L),
>              Dimnames = list(NULL,
>                              c("x.age", "x.sex", "x.bmi", "x.map", "x.tc",
>                                "x.ldl", "x.hdl", "x.tch", "x.ltg", "x.glu")))
> scale.default(x, center=FALSE, scale = normx)
> ## Error in scale.default(x, center = FALSE, scale = normx) :
> ## length of 'scale' must equal the number of columns of 'x'
>
> > The problem is that this check fails because is.numeric(normx) is FALSE:
>
> > if (is.numeric(scale) && length(scale) == nc)
>
> > So, the error message is misleading. In fact length(scale) is the same as
> > nc.
>
> Correct, twice.
>
> > At a minimum, the error message needs to be repaired; do we also want to
> > attempt as.numeric(normx) (which I believe would have allowed scale to work
> > in this case)?
>
> It seems sensible to allow both 'center' and 'scale' to only
> have to *obey* as.numeric(.) rather than fulfill is.numeric(.).
>
> Though that is not a bug in scale() as its help page has always
> said that 'center' and 'scale' should either be a logical value
> or a numeric vector.
>
> For that reason I can really claim a bug in 'lars' which should
> really not use
>
> scale(x, FALSE, normx)
>
> but rather
>
> scale(x, FALSE, scale = as.numeric(normx))
>
> and then all would work.
>
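
A minimal sketch of the workaround described above, using a small made-up
matrix instead of the diabetes data (the values of x and normx here are
illustrative assumptions, not taken from lars). It shows that a 1 x 2
dgeMatrix has the right length but is not is.numeric(), and that wrapping it
in as.numeric() lets scale() proceed:

```r
library(Matrix)  # provides the "dgeMatrix" class

# Toy 3 x 2 base matrix (illustrative values only)
x <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)

# A 1 x 2 dense Matrix, built the same way as 'normx' in the report
normx <- new("dgeMatrix", x = c(2, 4), Dim = c(1L, 2L))

is.numeric(normx)           # FALSE: an S4 Matrix is not a base numeric vector
length(normx) == ncol(x)    # TRUE: so the length was never the problem

# Coercing explicitly gives scale() the plain numeric vector it documents
scaled <- scale(x, center = FALSE, scale = as.numeric(normx))
```

Whether the uncoerced call scale(x, FALSE, normx) errors may depend on the R
version in use, so the sketch only exercises the explicit as.numeric() form.
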
> > -----------------
>
> > (I'm aware that there are some import issues in lars, as the offending line
> > to create normx *should* work, as
> > is.numeric(sqrt(drop(rep(1, nrow(x)) %*% (x^2)))) is TRUE -- it's simply
> > that lars doesn't import the appropriate S4 methods)
>
> > Michael Chirico
>
> Yes, 'lars' has _not_ been updated since Spring 2013, notably
> because its authors have been saying (for rather more than 5
> years I think) that one should really use
>
> require("glmnet")
>
> instead.
>
> Your point is still valid that it would be easy to enhance
> base :: scale.default() so it'd work in more cases.
>
> Thank you for that. I do plan to consider such a change in
> R-devel (planned to become R 3.5.0 in April).
>
> Martin Maechler,
> ETH Zurich
>
>
>
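
For what it's worth, here is one way the relaxed check discussed above could
look. This is only a sketch under my own assumptions: check_scale_arg() is a
hypothetical name, it is not taken from R's sources, and the second error
message is my own wording. The idea is to report a length problem only when
the length really is wrong, and otherwise to accept anything that
as.numeric() can coerce:

```r
# Hypothetical helper (not part of base R): accept anything that
# as.numeric() can coerce, and report the actual reason on failure.
check_scale_arg <- function(arg, nc, what = "scale") {
  # Report a length mismatch only when the length really is wrong
  if (length(arg) != nc)
    stop(sprintf("length of '%s' must equal the number of columns of 'x'", what))
  # Try the coercion; turn any failure into a message about the type
  out <- tryCatch(as.numeric(arg), error = function(e) NULL)
  if (is.null(out) || length(out) != nc)
    stop(sprintf("'%s' must be numeric or coercible by as.numeric()", what))
  out
}

# Not is.numeric(), but coercible, so it is accepted
check_scale_arg(c("2", "4"), nc = 2)
```

A wrong length, e.g. check_scale_arg(1:3, nc = 2), still stops with the
original message, which in that case is accurate.
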