[R] R-squared with and without constant
Tim Calkins
tcalkins at gmail.com
Wed Nov 22 00:00:52 CET 2006
Greetings Listers!
The R-squared value reported by summary() for an lm fit is calculated as
1 - RSS/RSS_m
where RSS is the residual sum of squares of the fitted model and RSS_m
is the residual sum of squares of a minimal model. In most cases, the
minimal model is simply y = mean(y), but when the constant is left out
of the model, the minimal model is y = 0.
However, if you drop the intercept and add a constant column by hand,
R still treats y = 0 as the minimal model. This also changes the
F statistic, its degrees of freedom, and the p-value.
Is there a way to specify that the R-squared should be calculated
using y = mean(y)?
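To make the two definitions concrete, both versions of 1 - RSS/RSS_m can be computed by hand. This is only a sketch: the seed, the helper rss(), and the names fit_int, fit_man, r2_centered and r2_uncentered are assumptions introduced for illustration. The centered value should agree with what summary() reports for a ~ b, and the uncentered value with what it reports for a ~ b + c - 1.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
a <- rnorm(100, 10, 5)
b <- rnorm(100, 10, 5)
c <- rep(1, 100)                  # hand-made constant column

fit_int <- lm(a ~ b)              # implicit intercept
fit_man <- lm(a ~ b + c - 1)      # intercept removed, constant added by hand

# Hypothetical helper: residual sum of squares of a fitted model
rss <- function(fit) sum(residuals(fit)^2)

# Minimal model y = mean(y): RSS_m is the centered total sum of squares
r2_centered <- 1 - rss(fit_man) / sum((a - mean(a))^2)

# Minimal model y = 0: RSS_m is the raw (uncentered) sum of squares
r2_uncentered <- 1 - rss(fit_man) / sum(a^2)

# Compare with the values reported by summary()
print(c(by_hand = r2_centered,   summary = summary(fit_int)$r.squared))
print(c(by_hand = r2_uncentered, summary = summary(fit_man)$r.squared))
```

Both fits have the same residuals, so only the choice of RSS_m differs between the two numbers.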
Here's an example:
> a <- rnorm(100,10,5)
> b <- rnorm(100,10,5)
> c <- rep(1,100)
> summary(lm(a~b))
Call:
lm(formula = a ~ b)

Residuals:
     Min       1Q   Median       3Q      Max
-11.8677  -3.4442  -0.5625   4.1099  10.5102

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 10.23724    1.05256   9.726 4.76e-16 ***
b           -0.02942    0.09818  -0.300    0.765
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.799 on 98 degrees of freedom
Multiple R-Squared: 0.0009153, Adjusted R-squared: -0.009279
F-statistic: 0.08978 on 1 and 98 DF, p-value: 0.7651
> summary(lm(a ~ b + c - 1))
Call:
lm(formula = a ~ b + c - 1)

Residuals:
     Min       1Q   Median       3Q      Max
-11.8677  -3.4442  -0.5625   4.1099  10.5102

Coefficients:
  Estimate Std. Error t value Pr(>|t|)
b -0.02942    0.09818  -0.300    0.765
c 10.23724    1.05256   9.726 4.76e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.799 on 98 degrees of freedom
Multiple R-Squared: 0.8146, Adjusted R-squared: 0.8108
F-statistic: 215.3 on 2 and 98 DF, p-value: < 2.2e-16
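For the F statistic, one possible way to get the comparison against y = mean(y) is to fit that minimal model explicitly and compare the two fits with anova(). This is a sketch under assumptions, not necessarily the recommended route: the seed and the names fit_null, fit_man, cmp, f_stat and p_val are made up for illustration, and the data are regenerated so the block runs on its own.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
a <- rnorm(100, 10, 5)
b <- rnorm(100, 10, 5)
c <- rep(1, 100)                  # hand-made constant column

fit_null <- lm(a ~ 1)             # minimal model y = mean(y)
fit_man  <- lm(a ~ b + c - 1)     # intercept removed, constant added by hand

# F test of the hand-built model against the mean-only model
cmp <- anova(fit_null, fit_man)
print(cmp)

f_stat <- cmp$F[2]                # F statistic on 1 and 98 DF
p_val  <- cmp$`Pr(>F)`[2]         # corresponding p-value
```

Because the two models differ only by the term b, this F test is on 1 and 98 degrees of freedom, the same comparison that summary(lm(a ~ b)) reports.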
Thanks in advance.
tim
--
Tim Calkins
0406 753 997