[R] Regression through the origin

Karl Ove Hufthammer karloh at mi.uib.no
Tue May 23 13:56:52 CEST 2006


Trujillo L. wrote:

> Sorry for the naiveness of my question but I have been trying in the
> R-help and the CRAN website without any success. I am trying to perform
> a regression through the origin (without intercept) and my main concern
> is about its evaluative statistics. It is clear for me that R squared
> does not make sense if you do not have an intercept in your model

There are many different definitions of R². Most of them are equivalent
*only* for the simple linear regression model, y = a + b · x + ε.

R (the program) uses a different formula for calculating R² when you fit a
regression through the origin than for a simple linear regression, and the
definition used *does* make sense. (Some other statistical software uses the
same definition in both cases, which makes the resulting statistic
meaningless in the case of regression through the origin.)
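
A small sketch of the two definitions, using simulated data (the seed and
the object names are illustrative assumptions, not part of the original
question). With an intercept, summary() measures the total variation about
mean(y); without one, it measures it about zero:

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rnorm(100)
y <- 1 + x + rnorm(100)

fit_int    <- lm(y ~ x)           # with intercept
fit_origin <- lm(y ~ x - 1)       # through the origin

rss_int    <- sum(residuals(fit_int)^2)
rss_origin <- sum(residuals(fit_origin)^2)

# With intercept: total variation is measured about mean(y)
r2_int <- 1 - rss_int / sum((y - mean(y))^2)

# Through the origin: total variation is measured about zero
r2_origin <- 1 - rss_origin / sum(y^2)

# Compare the hand-computed values with what summary() reports
all.equal(r2_int,    summary(fit_int)$r.squared)
all.equal(r2_origin, summary(fit_origin)$r.squared)
```

Because the two baselines differ, the two R² values should not be compared
with each other directly.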

The (IMHO) most sensible way to interpret R² is as the proportional
reduction in variation when fitting a more complex model instead of a
simpler one. See this excellent paper for a thorough discussion:

Anderson-Sprecher R. (1994). ‘Model comparisons and R²’. The American
Statistician, volume 48, number 2, pages 113–117.
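
As a rough illustration of this interpretation (a sketch of my own, not
code from the paper; prop_reduction() is a hypothetical helper and the
data are simulated), R² can be computed from the residual sums of squares
of two nested models. For a regression through the origin, the natural
simpler model is y = ε, with no parameters at all:

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rnorm(100)
y <- 1 + x + rnorm(100)

# Hypothetical helper: proportional reduction in residual sum of squares
# when moving from a simpler model to a more complex one.
# For lm objects, deviance() returns the residual sum of squares.
prop_reduction <- function(simple, complex) {
  (deviance(simple) - deviance(complex)) / deviance(simple)
}

# With intercept: compare y = a + e against y = a + b*x + e
pr_int <- prop_reduction(lm(y ~ 1), lm(y ~ x))

# Through the origin: compare y = e against y = b*x + e
pr_origin <- prop_reduction(lm(y ~ 0), lm(y ~ x - 1))

# Compare with the R-squared values reported by summary()
all.equal(pr_int,    summary(lm(y ~ x))$r.squared)
all.equal(pr_origin, summary(lm(y ~ x - 1))$r.squared)
```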

And for a discussion of the different definitions of R², especially when
using transformed variables, see:

Kvålseth T.O. (1985). ‘Cautionary note about R²’. The American Statistician,
volume 39, number 4, pages 279–285.

> If I still want to perform a regression in that
> conditions, does R have any options to evaluate the model adequacy
> correctly?

Try some diagnostic plots:

x = rnorm(100)
y = 1 + x + rnorm(100)    # simulated data; the true intercept is 1
l = lm(y ~ x - 1)         # Note: Wrong model (we need an intercept)!
summary(l)
par(mfrow = c(2, 2))      # 2 x 2 grid for the four diagnostic plots
plot(l)

As you can see, the fit is not very good. You might also want to try:

plot(y~x)
abline(l,col="red")
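
Beyond the plots, one further option (my suggestion here, sketched on
simulated data with an arbitrary seed; l0 and l1 are just illustrative
names) is to fit the model with an intercept as well and test whether the
intercept is needed:

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x <- rnorm(100)
y <- 1 + x + rnorm(100)

l0 <- lm(y ~ x - 1)               # through the origin
l1 <- lm(y ~ x)                   # with intercept

# F test comparing the two nested models
cmp <- anova(l0, l1)
cmp

# Equivalent t test for the intercept in the larger model
summary(l1)$coefficients["(Intercept)", ]
```

A small p-value indicates that dropping the intercept is not supported by
the data.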

-- 
Karl Ove Hufthammer


