[R] Testing for normality of residuals in a regression model
John Fox
jfox at mcmaster.ca
Fri Oct 15 19:56:49 CEST 2004
Dear Andy,
At the risk of muddying the waters (and certainly without wanting to
advocate the use of normality tests for residuals), I believe that your
point #4 is subject to misinterpretation: That is, while it is true that t-
and F-tests for regression coefficients in large samples retain their
validity well when the errors are non-normal, the efficiency of the LS
estimates can (depending upon the nature of the non-normality) be seriously
compromised, not only absolutely but in relation to alternatives (e.g.,
robust regression).
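
A minimal simulation sketch of the efficiency point, assuming the simplest
possible case: an intercept-only model, where the LS estimate is the sample
mean and the sample median stands in as a simple robust alternative. The
Laplace error distribution, the sample size, the number of replications, and
all function names are illustrative choices, not anything from this thread.

```python
import random
import statistics


def relative_efficiency(draw_error, n=50, reps=2000, seed=1):
    """Return var(sample mean) / var(sample median) across simulated samples.

    A ratio below 1 means the mean (LS) is the more efficient estimator;
    a ratio above 1 means the median (robust alternative) is.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [draw_error(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.variance(means) / statistics.variance(medians)


def normal_error(rng):
    # Standard normal errors: the case where LS is fully efficient.
    return rng.gauss(0.0, 1.0)


def laplace_error(rng):
    # Heavy-tailed errors: the difference of two independent Exp(1)
    # draws follows a Laplace(0, 1) distribution.
    return rng.expovariate(1.0) - rng.expovariate(1.0)


ratio_normal = relative_efficiency(normal_error)
ratio_laplace = relative_efficiency(laplace_error)
print(f"normal errors:  var(mean)/var(median) = {ratio_normal:.2f}")
print(f"Laplace errors: var(mean)/var(median) = {ratio_laplace:.2f}")
```

Under normal errors the ratio should come out below 1 (LS wins); under the
heavy-tailed errors it should come out above 1, even though both estimators
remain unbiased. How large the loss is depends on the error distribution
chosen, so this is an illustration rather than a general result.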
Regards,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Liaw, Andy
> Sent: Friday, October 15, 2004 11:55 AM
> To: 'Federico Gherardini'; Berton Gunter
> Cc: R-help mailing list
> Subject: RE: [R] Testing for normality of residuals in a
> regression model
>
> Let's see if I can get my stat 101 straight:
>
> We learned that linear regression has a set of assumptions:
>
> 1. Linearity of the relationship between X and y.
> 2. Independence of errors.
> 3. Homoscedasticity (equal error variance).
> 4. Normality of errors.
>
> Now, we should ask: Why are they needed? Can we get away
> with less? What if some of them are not met?
>
> It should be clear why we need #1.
>
> Without #2, I believe the least squares estimator is still
> unbiased, but the usual estimates of the SEs for the coefficients
> are wrong, so the t-tests are wrong.
>
> Without #3, the coefficients are, again, still unbiased, but
> not as efficient as can be. Interval estimates for the
> prediction will surely be wrong.
>
> Without #4, well, it depends. If the residual DF is
> sufficiently large, the t-tests are still approximately valid
> because of the CLT. You do need normality if you have small
> residual DF.
>
> The problem with normality tests, I believe, is that they
> usually have fairly low power at small sample sizes, so that
> doesn't quite help. There's no free lunch: A normality test
> with good power will usually have good power against a fairly
> narrow class of alternatives, and almost no power against
> others (directional test). How do you decide what to use?
>
> Has anyone seen a data set where the normality test on the
> residuals is crucial in coming up with an appropriate analysis?
>
> Cheers,
> Andy
>
> > From: Federico Gherardini
> >
> > Berton Gunter wrote:
> >
> > >>>Exactly! My point is that normality tests are useless for
> > this purpose for
> > >>>reasons that are beyond what I can take up here.
> > >>>
> > Thanks for your suggestions, I understand that! Could you possibly
> > give me some (not too complicated!) links so that I can investigate
> > this matter further?
> >
> > Cheers,
> >
> > Federico
> >
> > >>>Hints: Balanced designs are
> > >>>robust to non-normality; independence (especially
> > "clustering" of subjects
> > >>>due to systematic effects), not normality is usually the
> > biggest real
> > >>>statistical problem; hypothesis tests will always reject
> > when samples are
> > >>>large -- so what!; "trust" refers to prediction validity
> > which has to do
> > >>>with study design and the validity/representativeness of
> > the current data to
> > >>>future.
> > >>>
> > >>>I know that all the stats 101 texts say to test for
> > normality, but they're
> > >>>full of baloney!
> > >>>
> > >>>Of course, this is "free" advice -- so caveat emptor!
> > >>>
> > >>>Cheers,
> > >>>Bert
> > >>>
> > >>>
> > >>>
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
> >
>