[Rd] Bug: wrong R-squared in lm formula w/o intercept (PR#7127)

Deepayan Sarkar deepayan at stat.wisc.edu
Thu Jul 22 05:15:23 CEST 2004


On Wednesday 21 July 2004 22:22, adriano at techoffix.org wrote:
> Full_Name: Adriano Azevedo Filho
> Version: 1.9.1
> OS: Windows, Linux
> Submission from: (NULL) (200.171.246.212)
>
>
> R-squared and Adjusted R-squared appear to be wrong when
> the formula in lm() is specified without intercept. Problem
> present in both Windows and Linux 1.9.1 version. Also
> in the 1.8.1 version for Windows (other versions not
> checked).
> Possible example which reproduces the problem:
> x<-1:10
> y<-c(2,4,3,4,6,9,10,12,15,13)
> summary(lm(y~x)) # with intercept, result is OK
> #Residual standard error: 1.329 on 8 degrees of freedom
> #Multiple R-Squared: 0.9262,     Adjusted R-squared: 0.917
> summary(lm(y~x-1)) # without intercept,
> #Residual standard error: 1.26 on 9 degrees of freedom
> #Multiple R-Squared: 0.9821,     Adjusted R-squared: 0.9802
> #>>>> Not possible to have a R-squared larger in the restricted
> #>>>> model (forced without intercept)

Didn't it seem a bit odd to you that there would be such a fundamental 
bug in something so basic in such widely used statistical software? 
Perhaps you should have considered the other possibility (that you are 
interpreting the R-squared incorrectly) and posted this on r-help 
rather than filing it as a bug. Please read the paragraph titled 
"Surprising behavior and bugs" in the posting guide.

R^2 measures how much of the variability in the response is explained 
by a model compared to a baseline model. Normally, the baseline is 
taken to be the model with an intercept and nothing else. However, 
this doesn't make sense when the fitted model has no intercept, 
because the intercept-only model is then not a submodel. For a model 
without an intercept, the baseline is instead the model with no terms 
at all (fitted values identically zero); see the description of 
r.squared in ?summary.lm. In other words, the R^2 values for the two 
models you have fit are calculated with different baseline models, and 
hence are not comparable.
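
To make the two baselines concrete, here is a minimal sketch that 
recomputes both R^2 values by hand for the data in the report. The 
variable names (fit1, fit0, tss_centered, tss_uncentered, and so on) 
are just illustrative choices, and this is only a reconstruction from 
residual and total sums of squares, not a copy of what summary.lm() 
does internally.

```r
x <- 1:10
y <- c(2, 4, 3, 4, 6, 9, 10, 12, 15, 13)

fit1 <- lm(y ~ x)       # with intercept
fit0 <- lm(y ~ x - 1)   # without intercept

# Residual sums of squares of the two fits
rss1 <- sum(resid(fit1)^2)
rss0 <- sum(resid(fit0)^2)

# Total sums of squares under the two baseline models
tss_centered   <- sum((y - mean(y))^2)  # baseline: intercept-only model
tss_uncentered <- sum(y^2)              # baseline: no terms, fitted values all 0

r2_with    <- 1 - rss1 / tss_centered    # about 0.9262, as reported for y ~ x
r2_without <- 1 - rss0 / tss_uncentered  # about 0.9821, as reported for y ~ x - 1

# Assumption for illustration only: score the no-intercept fit against
# the same (centered) baseline as the intercept fit.
r2_without_centered <- 1 - rss0 / tss_centered

print(c(with = r2_with, without = r2_without,
        without_centered = r2_without_centered))
```

When both fits are measured against the same intercept-only baseline, 
the no-intercept model no longer comes out ahead, which is what one 
would expect for the restricted model.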

Deepayan


