[R] Interpreting lm Residuals...
David Riebel
driebel at pha.jhu.edu
Mon Jun 21 16:27:41 CEST 2010
I am using the lm function in R to fit several linear models to a
fair-sized dataset (~160 collections of ~1000 data points each). My
data have intrinsic, systematic uncertainty much greater than the
measurement errors on any individual point. My thought is to use the
residuals of my linear fits to quantify this intrinsic uncertainty, but
I am puzzled over the correct interpretation of R's output.
I have attached plots of the fit and the residuals for one of my
sub-groups, for illustration. By eye, the overwhelming majority of the
residuals are within ±0.4, and I would therefore expect the standard
error of the residuals to be ~0.2. However, the output from lm does not
show this:
> summary(ofit)

Call:
lm(formula = omag ~ oper, weights = (1/oerr))

Residuals:
     Min       1Q   Median       3Q      Max
-3.32185 -0.41181  0.03983  0.40041  2.52971

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 19.52847    0.03979   490.8   <2e-16 ***
oper        -4.25297    0.02101  -202.4   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6705 on 2287 degrees of freedom
Multiple R-squared: 0.9471,     Adjusted R-squared: 0.9471
F-statistic: 4.097e+04 on 1 and 2287 DF,  p-value: < 2.2e-16

The plot thickens when I examine the residuals themselves:
> summary(resid(ofit))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.611800 -0.095720  0.010200  0.005954  0.101100  0.680700
> sd(resid(ofit))
[1] 0.1533568

These numbers are much closer to what I see by eye. There really aren't
any residuals beyond about ±0.6, and certainly nothing as large in
magnitude as 3.3! The help page for lm tells me that the residuals are
"the residuals, that is response minus fitted values." Exactly what I
would expect. As an astronomer, my knowledge of statistics is rather
"workman-like," if you will. To me, "Residual standard error" means "the
standard deviation of the residuals," but the lm output doesn't seem to
agree with this. I'd appreciate it if someone could clarify what is
being output by the summary function acting on an lm object.
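
One possible source of the discrepancy, sketched here on synthetic data
only: for a fit that uses weights, summary() on an lm object tabulates
the weighted residuals, sqrt(w) * (response - fitted), and computes the
residual standard error from them, whereas resid() returns the
unweighted residuals by default. The names x, y, err and w below are
hypothetical stand-ins for oper, omag, oerr and 1/oerr, and none of the
numbers come from the real data.

```r
# Synthetic illustration; x, y, err, w are made-up stand-ins
# for oper, omag, oerr and 1/oerr in the post.
set.seed(1)
n   <- 500
x   <- runif(n, 0, 3)
err <- runif(n, 0.01, 0.1)                    # assumed per-point measurement errors
y   <- 19.5 - 4.25 * x + rnorm(n, sd = 0.15)  # intrinsic scatter of 0.15
w   <- 1 / err                                # same weighting scheme as in the post

fit <- lm(y ~ x, weights = w)

raw  <- resid(fit)        # unweighted: response minus fitted values
wres <- sqrt(w) * raw     # weighted residuals, the ones summary(fit) tabulates

# Residual standard error rebuilt by hand from the weighted residuals
rse <- sqrt(sum(w * raw^2) / df.residual(fit))

# Both comparisons should agree with what R reports
print(all.equal(rse, summary(fit)$sigma))
print(all.equal(unname(wres), unname(weighted.residuals(fit))))

# The two spreads are on different scales
print(c(residual_standard_error = rse, sd_of_raw_residuals = sd(raw)))
```

If the same thing is happening with ofit, then
sqrt(sum(weights(ofit) * resid(ofit)^2) / df.residual(ofit)) should
reproduce the reported residual standard error, while sd(resid(ofit))
stays in the original units of omag.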
Replies by e-mail preferred.
Thanks,
David Riebel
Graduate Research Assistant
Johns Hopkins University
Department of Physics and Astronomy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: o_seq2_fit.ps
Type: application/postscript
Size: 58665 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100621/e6a8fb56/attachment.ps>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: o_seq2_resid.ps
Type: application/postscript
Size: 58730 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100621/e6a8fb56/attachment-0001.ps>