[R] multiple regression w/ no intercept; strange results

Thomas Lumley tlumley at u.washington.edu
Mon Jun 29 19:02:05 CEST 2009


On Mon, 29 Jun 2009, John Hunter wrote:


> But my question was more numerical: in particular, the R^2 of the
> model should be equal to the square of the correlation between the fit
> values and the actual values.

No.

> It is with the intercept and is not w/o
> it, as my code example shows.  Am I correct in assuming these should
> always be the same, and if they are not, does it reflect a bug in R or
> perhaps a numerical instability?
>

No.

The R^2 is one minus the ratio of the sum of squared errors in the model to the sum of squared errors in the null model ('proportion of variation explained').

For a model with no intercept, the null model is mu=0, so the R^2 is one minus the sum of squared residuals divided by the sum of squared y values (uncentered, i.e. not taken about their mean).
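
To make the difference concrete, here is a minimal sketch in plain Python (not R, and not the code lm actually runs). The data are made up for illustration, and the names x, y, r2_no_intercept and r2_corr are hypothetical. It fits y = b*x through the origin and computes both candidate definitions.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 3.9, 5.2, 5.8, 7.1]

def mean(v):
    return sum(v) / len(v)

def corr(a, b):
    # Pearson correlation between two equal-length sequences.
    ma, mb = mean(a), mean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5

# Least-squares slope for y = b*x with no intercept.
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
fitted = [b * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Null model mu = 0: its squared errors are just the squared y values.
sse_null = sum(yi ** 2 for yi in y)
r2_no_intercept = 1 - sse / sse_null

# The other candidate definition: squared correlation of fitted and actual.
r2_corr = corr(fitted, y) ** 2

print(r2_no_intercept, r2_corr)
```

The two numbers differ because the correlation centers both series about their means, while the no-intercept null model compares against zero.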

One could define the R^2 as you expected, and arguments could be made either way. The definition that lm uses keeps the connection to the likelihood that your definition loses in the no-intercept case.
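
One way to read the connection to the likelihood (this reading is an assumption, not something spelled out above): for Gaussian errors with the variance set to its maximum-likelihood estimate, the ratio of the two sums of squared errors is a function of the likelihood ratio between the model and the null model. A minimal Python sketch with made-up data and a hypothetical helper gaussian_max_loglik:

```python
import math

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 3.9, 5.2, 5.8, 7.1]
n = len(y)

def gaussian_max_loglik(sse, n):
    # Maximized Gaussian log-likelihood with variance at its MLE, sse / n.
    sigma2 = sse / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

# No-intercept fit y = b*x and the null model mu = 0.
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
sse_model = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
sse_null = sum(yi ** 2 for yi in y)

# R^2 from the sums of squares, and again from the log-likelihoods.
r2_sse = 1 - sse_model / sse_null
ll_model = gaussian_max_loglik(sse_model, n)
ll_null = gaussian_max_loglik(sse_null, n)
r2_lik = 1 - math.exp(-2 * (ll_model - ll_null) / n)

print(r2_sse, r2_lik)
```

The two values agree up to rounding. The squared correlation of fitted and actual values has no such identity against the mu=0 null model.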

       -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle



