[R] lm.ridge {MASS} intercept questions
Jimmy Purnell
jpurnell7 at gmail.com
Mon Apr 26 21:34:54 CEST 2010
I am trying to understand the code for lm.ridge from the MASS package.
Here is the part I am having trouble understanding:
    if(Inter <- attr(Terms, "intercept"))
    {
        Xm <- colMeans(X[, -Inter])
        Ym <- mean(Y)
        p <- p - 1
        X <- X[, -Inter] - rep(Xm, rep(n, p))
        Y <- Y - Ym
    } else Ym <- Xm <- NA
    Xscale <- drop(rep(1/n, n) %*% X^2)^0.5
    X <- X/rep(Xscale, rep.int(n, p))
(the full code is on page 24 here:
http://www.stats.ox.ac.uk/pub/MASS4/VR4ex.pdf )
If there is an intercept term, lm.ridge removes the intercept column
and centers the remaining X columns by subtracting each column's mean
from all of its values. Then it divides the centered X's by Xscale,
which is the population standard deviation of each column (divisor n
rather than n - 1). This all makes sense to me.
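
A minimal sketch of the intercept case described above, using made-up data (the matrix X, the seed, and the sizes n and p are assumptions for illustration, not part of lm.ridge). It repeats the centering and scaling steps from the quoted code and compares Xscale with a population standard deviation computed directly:

```r
# Hypothetical data: n = 50 rows, an intercept column plus 3 predictors.
set.seed(1)
n <- 50
X <- cbind(1, matrix(rnorm(n * 3, mean = 5, sd = 2), n, 3))
p <- ncol(X)
Inter <- 1                                  # position of the intercept column

# Same steps as the intercept branch quoted above
Xm <- colMeans(X[, -Inter])
p <- p - 1
Xc <- X[, -Inter] - rep(Xm, rep(n, p))      # subtract each column's mean
Xscale <- drop(rep(1/n, n) %*% Xc^2)^0.5    # sqrt of the mean squared deviation

# Population SD (divisor n) computed directly, for comparison
pop_sd <- apply(X[, -Inter], 2, function(x) sqrt(mean((x - mean(x))^2)))
print(all.equal(Xscale, pop_sd))

# sd() uses divisor n - 1, so it differs from Xscale by sqrt((n - 1) / n)
sample_sd <- apply(X[, -Inter], 2, sd)
print(all.equal(Xscale, sample_sd * sqrt((n - 1) / n)))
```

The expression rep(Xm, rep(n, p)) repeats each column mean n times, which lines up with R's column-major storage of the matrix.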
What I don't understand is the case where there is no intercept term.
In that case, lm.ridge does not center the X columns, and divides by
an Xscale that is not the standard deviation: because X has not been
centered, the Xscale formula gives the root mean square of each
column instead. What is the reason for this different approach? I
would have thought that the X columns should be centered and divided
by their standard deviations regardless of whether there was an
intercept term.
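
To make the difference concrete, here is a small sketch of the no-intercept case, again on made-up data (X, n, p and the seed are assumptions). With no centering, the Xscale formula gives the root mean square of each column, whose square equals the population variance plus the squared column mean:

```r
# Hypothetical data with non-zero column means and no intercept column.
set.seed(2)
n <- 50
p <- 3
X <- matrix(rnorm(n * p, mean = 5, sd = 2), n, p)

# No-intercept branch: X is left uncentered before Xscale is computed
Xscale <- drop(rep(1/n, n) %*% X^2)^0.5

# Root mean square and population SD of each column, computed directly
rms <- sqrt(colMeans(X^2))
pop_sd <- sqrt(colMeans(sweep(X, 2, colMeans(X))^2))

print(all.equal(Xscale, rms))
# RMS^2 = population variance + mean^2, so Xscale exceeds the SD
# whenever a column mean is non-zero
print(all.equal(Xscale^2, pop_sd^2 + colMeans(X)^2))
```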
My other question is about the case where there is an intercept term.
If one uses coef() on the results of lm.ridge, it lists the value of
the intercept along with the value of the (original scale)
coefficients. However, I cannot see any place in the lm.ridge code
where the intercept value is calculated, or how to access it from
within the code (e.g. fit$coef returns the scaled coefficients but no
intercept value, and fit$Inter just returns whether there was an
intercept or not).
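
One way the intercept could be recovered, sketched in base R under stated assumptions: ridge_sketch is a hypothetical helper, not the MASS implementation, and it solves the penalized normal equations directly rather than following whatever numerical route lm.ridge takes. The idea is that after centering and scaling, the original-scale slopes are the scaled coefficients divided by Xscale, and the intercept is Ym minus the sum of those slopes times Xm. The fitted object appears to store these pieces as fit$ym, fit$xm and fit$scales, which coef() could combine in this way; that is worth checking against the MASS source.

```r
# Hypothetical helper: ridge fit with an intercept, mirroring the
# centering and scaling steps quoted above.
ridge_sketch <- function(X, Y, lambda) {
  n <- nrow(X)
  p <- ncol(X)
  Xm <- colMeans(X)
  Ym <- mean(Y)
  Xc <- X - rep(Xm, rep(n, p))                 # center the columns
  Xscale <- drop(rep(1/n, n) %*% Xc^2)^0.5     # population SD per column
  Xs <- Xc / rep(Xscale, rep.int(n, p))        # scale the columns
  # Coefficients on the centered and scaled X
  b_scaled <- drop(solve(crossprod(Xs) + lambda * diag(p),
                         crossprod(Xs, Y - Ym)))
  b_orig <- b_scaled / Xscale                  # back to the original scale
  c(Intercept = Ym - sum(b_orig * Xm), b_orig)
}

# Made-up data for illustration
set.seed(3)
n <- 40
X <- matrix(rnorm(n * 2, mean = 3), n, 2)
Y <- 1.5 + X %*% c(2, -1) + rnorm(n)

# With lambda = 0 the result should agree with ordinary least squares
ols <- unname(coef(lm(Y ~ X)))
print(all.equal(unname(ridge_sketch(X, Y, 0)), ols))

# With lambda > 0 the slopes are shrunk and the intercept adjusts with them
print(ridge_sketch(X, Y, 5))
```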
Thanks,
Jimmy