[R] Different Lambdas and Coefficients between cv.glmnet and intercept = FALSE

Tue Feb 23 23:14:22 CET 2021

Please note, per the posting guide linked below:

"*Questions about statistics:* The R mailing lists are primarily intended
for questions and discussion about the R software. However, questions about
statistical methodology are sometimes posted. If the question is well-asked
and of interest to someone on the list, it *may* elicit an informative
up-to-date answer. See also the Usenet groups sci.stat.consult (applied
statistics and consulting) and sci.stat.math (mathematical stat and
probability)."  -- also stats.stackexchange.com

Also:
"For questions about functions in standard packages distributed with R (see
the FAQ Add-on packages in R
<https://cran.r-project.org/doc/FAQ/R-FAQ.html#Add-on-packages-in-R>), ask
questions on R-help.
If the question relates to a *contributed package*, e.g., one downloaded
from CRAN, try contacting the package maintainer first. You can also use
find("functionname") and packageDescription("packagename") to find this
information. *Only* send such questions to R-help or R-devel if you get no
reply or need further assistance. This applies to both requests for help
and to bug reports." -- see also ?maintainer

Your query seems to be mostly statistical in nature and is certainly about a
non-standard package (glmnet). You might still get a useful response here,
but if you do not within a few days, try the venues suggested above.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Feb 23, 2021 at 1:07 PM <kevinegan31 using gmail.com> wrote:

> Hello,
>
> I'm currently reviewing how to use `glmnet` correctly and am having a hard
> time understanding why the results differ between fits with
> `intercept = TRUE` and `intercept = FALSE`. I expected `intercept = FALSE`
> simply to drop the intercept from the model, but it seems to behave a bit
> differently and I'm not sure how.
>
> For a given lambda, if both `x` and `y` are scaled, the two fits appear to
> give the same coefficients:
> ```
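> library(glmnet)
> # Loads `x` and `y` here; recent glmnet versions may instead load a list,
> # in which case use x <- QuickStartExample$x and y <- QuickStartExample$y
> data(QuickStartExample)
> # Lambda grid, used for the cv.glmnet() calls further down
> lambda_grid <- 10 ^ seq(10, -2, length = 100)
> With_Intercept <- glmnet(scale(x), c(scale(y)))
> Without_Intercept <- glmnet(scale(x), c(scale(y)), intercept = FALSE)
> # Extract coefficients at a single value of lambda, dropping the intercept row
> cbind(coef(With_Intercept, s = 0.01), coef(Without_Intercept, s = 0.01))[-1, ]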
> ```
> While this is good, it's not clear to me how to put these coefficients back
> onto their original scale. Further, this is for a single, fixed value of
> lambda. When using `cv.glmnet`, I'd like to identify the optimal lambda,
> for example:
> ```
> With_Intercept <- cv.glmnet(scale(x), c(scale(y)), lambda = lambda_grid)
> Without_Intercept <- cv.glmnet(scale(x), c(scale(y)), lambda = lambda_grid,
>                                intercept = FALSE)
> cbind(
>   coef(With_Intercept, s = With_Intercept$lambda.min, exact = TRUE,
>        x = scale(x), y = scale(y)),
>   coef(Without_Intercept, s = Without_Intercept$lambda.min, exact = TRUE,
>        x = scale(x), y = scale(y))
> )[-1, ]
> ```
> If I use `With_Intercept$lambda.min` to extract coefficients from the
> `Without_Intercept` model, I get the same coefficients, but this doesn't
> give me confidence about which model is the right one to use. Further, I'm
> not sure how to put the coefficients back onto the original scale.
>
> I've tried to compare all of the possible combinations between
> standardising, scaling, and leaving the variables as they are, but I'm
> still struggling with the best method and how to ensure I'm implementing
> `glmnet` correctly.
>
> If anyone has advice on how to proceed and interpret these methods or get
> consistent results I would appreciate it. I've been reading the
> Introduction to Statistical Learning, Elements of Statistical Learning,
> Statistical Learning and Sparsity, as well as the `glmnet` vignette but am
> still a bit unclear.
>
> Thanks,
>
> Kevin

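On the back-transformation question in the quoted message, here is a minimal sketch rather than a definitive recipe. It assumes `scale()` was called with its defaults, so the centring and scaling constants can be read from the attributes it attaches. The data are simulated stand-ins (not the QuickStartExample data), and the names `b_std`, `beta` and `beta0` are hypothetical. The algebra is just substituting x_s = (x - mx)/sx and y_s = (y - my)/sy into the fitted equation.

```r
library(glmnet)

# Simulated stand-in data (hypothetical; not the QuickStartExample data)
set.seed(1)
n <- 100
p <- 5
x <- matrix(rnorm(n * p, mean = 3, sd = 2), n, p)
y <- drop(1 + x %*% c(2, -1, 0, 0, 0.5) + rnorm(n))

# Standardise, keeping the constants that scale() stored as attributes
xs <- scale(x)
ys <- scale(y)
mx <- attr(xs, "scaled:center")
sx <- attr(xs, "scaled:scale")
my <- attr(ys, "scaled:center")
sy <- attr(ys, "scaled:scale")

fit <- glmnet(xs, c(ys), intercept = FALSE)

# Coefficients on the standardised scale; the first element is the intercept
b_std <- as.matrix(coef(fit, s = 0.01))[, 1]

# Back-transform: slopes are rescaled by sy / sx, and the intercept absorbs
# the centring of both x and y
beta  <- sy * b_std[-1] / sx
beta0 <- my + sy * b_std[1] - sum(beta * mx)

# Check: predictions from the back-transformed coefficients on the raw x
# should match the un-standardised predictions from the fitted object
pred_std <- drop(predict(fit, newx = xs, s = 0.01))
pred_raw <- drop(beta0 + x %*% beta)
all.equal(unname(pred_raw), unname(my + sy * pred_std))
```

Note that lambda is tied to the scale of the response, so `s = 0.01` with a standardised `y` is not the same penalty as `s = 0.01` with the raw `y`; the sketch only re-expresses one fitted model in the original units.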