[R] lm without intercept
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Fri Feb 18 12:25:36 CET 2011
On Fri, 18 Feb 2011, Jan wrote:
> Hi,
>
> I am not a statistics expert, so I have this question. A linear model
> gives me the following summary:
>
> Call:
> lm(formula = N ~ N_alt)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -110.30  -35.80  -22.77   38.07  122.76
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  13.5177   229.0764   0.059   0.9535
> N_alt         0.2832     0.1501   1.886   0.0739 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 56.77 on 20 degrees of freedom
> (16 observations deleted due to missingness)
> Multiple R-squared: 0.151, Adjusted R-squared: 0.1086
> F-statistic: 3.558 on 1 and 20 DF, p-value: 0.07386
>
> The regression is not very good (high p-value, low R-squared).
Yes.
> The Pr value for the intercept seems to indicate that it is zero with a
> very high probability (95.35%).
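Not quite. Consult your statistics textbook for the correct interpretation
of p-values. The p-value is not the probability that the intercept is zero.
It says that, under the null hypothesis of a true intercept of zero, it is
very likely (probability about 0.95) to observe an estimated intercept at
least as far from zero as 13.52. So the data are compatible with a zero
intercept, but with a standard error of 229 they are compatible with many
other values as well.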
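
As a rough illustration of where the p-value of 0.9535 for the intercept
comes from, the sketch below recomputes it from the estimate and standard
error printed in the summary above. The variable names (est, se, df) are
only chosen for this example; the calculation assumes the usual two-sided
t-test that summary.lm reports.

```r
# Values copied from the summary() output quoted above
est <- 13.5177   # estimated intercept
se  <- 229.0764  # its standard error
df  <- 20        # residual degrees of freedom

t_value <- est / se                        # t statistic for H0: intercept = 0
p_value <- 2 * pt(-abs(t_value), df = df)  # two-sided p-value

round(t_value, 3)
round(p_value, 4)
```

The result should match the "t value" and "Pr(>|t|)" columns of the
(Intercept) row up to rounding. It is a statement about how extreme the
estimate is if the true intercept were zero, not about the probability of
the intercept being zero.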
> So I repeat the regression forcing the intercept to zero:
Do you have a good interpretation for that?
> Call:
> lm(formula = N ~ N_alt - 1)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -110.11  -36.35  -22.13   38.59  123.23
>
> Coefficients:
>       Estimate Std. Error t value Pr(>|t|)
> N_alt 0.292046   0.007742   37.72   <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 55.41 on 21 degrees of freedom
> (16 observations deleted due to missingness)
> Multiple R-squared: 0.9855, Adjusted R-squared: 0.9848
> F-statistic: 1423 on 1 and 21 DF, p-value: < 2.2e-16
>
> 1. Is my interpretation correct?
> 2. Is it possible that just by forcing the intercept to become zero, a
> bad regression becomes an extremely good one?
> 3. Why doesn't lm suggest a value of zero (or near zero) by itself if
> the regression is so much better with it?
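The model without intercept needs to be interpreted differently. The
p-value pertains to a test of a regression with intercept zero and slope
0.292 against a model with both intercept zero and slope zero. If I had to
guess, I would say this is not a very meaningful comparison for your data.
The same is true for the R-squared: without an intercept it is computed
relative to a baseline of zero rather than the mean of N, so it is not
comparable to the R-squared of the model with intercept (see also
?summary.lm for its definition in the case without intercept). Note that
the residual standard errors of the two fits (56.77 and 55.41) are almost
identical, so the fit itself has hardly changed.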
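
To make the point about the R-squared concrete, here is a minimal sketch
on simulated data. The data (x, y) are hypothetical and only meant to
resemble the situation above (a large response with weak dependence on the
regressor); they are not the original N and N_alt. The sketch recomputes
both R-squared values by hand, following the definition given in
?summary.lm.

```r
set.seed(1)
# Hypothetical data: large response values, weak dependence on x
x <- runif(30, 1400, 1600)
y <- 13.5 + 0.28 * x + rnorm(30, sd = 55)

fit1 <- lm(y ~ x)      # with intercept
fit0 <- lm(y ~ x - 1)  # intercept forced to zero

rss1 <- sum(resid(fit1)^2)  # residual sum of squares, with intercept
rss0 <- sum(resid(fit0)^2)  # residual sum of squares, without intercept

# With intercept: baseline is the mean of y (centered total sum of squares)
r2_centered <- 1 - rss1 / sum((y - mean(y))^2)
# Without intercept: baseline is zero (uncentered total sum of squares)
r2_uncentered <- 1 - rss0 / sum(y^2)

# Both hand-computed values should agree with summary.lm
all.equal(r2_centered, summary(fit1)$r.squared)
all.equal(r2_uncentered, summary(fit0)$r.squared)

# The same no-intercept fit, judged against the mean of y instead of zero
1 - rss0 / sum((y - mean(y))^2)
```

The residual sums of squares of the two fits are similar, but the
no-intercept R-squared is much larger because sum(y^2) is far bigger than
the variation of y around its mean. Judged against the same baseline, the
no-intercept model cannot be better than the model with intercept.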
hth,
Z
> Please excuse my ignorance.
>
> Jan Rheinländer
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.