[R] multiple regression w/ no intercept; strange results
Thomas Lumley
tlumley at u.washington.edu
Mon Jun 29 19:02:05 CEST 2009
On Mon, 29 Jun 2009, John Hunter wrote:
> But my question was more numerical: in particular, the R^2 of the
> model should be equal to the square of the correlation between the fit
> values and the actual values.
No.
> It is with the intercept and is not w/o
> it, as my code example shows. Am I correct in assuming these should
> always be the same, and if they are not, does it reflect a bug in R or
> perhaps a numerical instability?
>
No.
The R^2 is based on dividing the sum of squared errors in the model by the sum of squared errors in the null model and subtracting that ratio from one ('proportion of variation explained').
For a model with no intercept, the null model is mu=0, so the R^2 is one minus the sum of squared residuals divided by the sum of squared y values.
One could define the R^2 as you expected, and arguments could be made either way. The definition that lm uses keeps the connection to the likelihood that your definition loses in the no-intercept case.
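A minimal sketch of the two definitions, assuming simulated data (the data and all variable names here are made up for illustration and are not from the original example). It compares the R^2 reported by summary() with one minus the residual sum of squares over the null-model sum of squares, and with the squared correlation between fitted and observed values:

```r
# Hypothetical simulated data; the true intercept is nonzero on purpose,
# so that dropping the intercept makes a visible difference.
set.seed(1)
x <- rnorm(50)
y <- 3 + 2 * x + rnorm(50)

fit1 <- lm(y ~ x)      # model with intercept
fit0 <- lm(y ~ x - 1)  # model with no intercept

# R^2 as reported by summary()
r2_with    <- summary(fit1)$r.squared
r2_without <- summary(fit0)$r.squared

# With an intercept the null model is mu = mean(y)
r2_with_manual <- 1 - sum(resid(fit1)^2) / sum((y - mean(y))^2)

# With no intercept the null model is mu = 0
r2_without_manual <- 1 - sum(resid(fit0)^2) / sum(y^2)

# Squared correlation between fitted and observed values
cor2_with    <- cor(fitted(fit1), y)^2
cor2_without <- cor(fitted(fit0), y)^2

# With intercept: all three numbers agree
print(c(reported = r2_with, manual = r2_with_manual, cor2 = cor2_with))

# No intercept: reported and manual agree, squared correlation differs
print(c(reported = r2_without, manual = r2_without_manual, cor2 = cor2_without))
```

With an intercept the three quantities coincide; without one, the reported value matches the mu=0 calculation and in general not the squared correlation.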
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle