[R] normality tests [Broadcast]

Frank E Harrell Jr f.harrell at vanderbilt.edu
Fri May 25 23:43:46 CEST 2007


gatemaze at gmail.com wrote:
> Thank you all for your replies... they have been very useful. In my
> case I have chosen to do some parametric tests (more precisely,
> correlations and linear regressions among some variables), so it
> would be nice to have some extra support for my decisions. If I
> understood your replies correctly, I shouldn't pay so much
> attention to the normality tests, so it wouldn't matter which one(s)
> I report, and I should rather focus on issues such as the power of the
> test.

If you are doing regression, I assume your normality tests were run on
the residuals rather than on the raw data.

Frank

> 
> Thanks again.
> 
> On 25/05/07, Lucke, Joseph F <Joseph.F.Lucke at uth.tmc.edu> wrote:
>> Most standard tests, such as t-tests and ANOVA, are fairly resistant to
>> non-normality for significance testing. It's the sample means that have
>> to be normal, not the data.  The CLT kicks in fairly quickly.  Testing
>> for normality prior to choosing a test statistic is generally not a good
>> idea.
>>
>> -----Original Message-----
>> From: r-help-bounces at stat.math.ethz.ch
>> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Liaw, Andy
>> Sent: Friday, May 25, 2007 12:04 PM
>> To: gatemaze at gmail.com; Frank E Harrell Jr
>> Cc: r-help
>> Subject: Re: [R] normality tests [Broadcast]
>>
>> From: gatemaze at gmail.com
>> >
>> > On 25/05/07, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
>> > > gatemaze at gmail.com wrote:
>> > > > Hi all,
>> > > >
>> > > > apologies for seeking advice on a general stats question. I've run
>> > > > normality tests using 8 different methods:
>> > > > - Lilliefors
>> > > > - Shapiro-Wilk
>> > > > - Robust Jarque Bera
>> > > > - Jarque Bera
>> > > > - Anderson-Darling
>> > > > - Pearson chi-square
>> > > > - Cramer-von Mises
>> > > > - Shapiro-Francia
>> > > >
>> > > > All show that the null hypothesis that the data come from a normal
>> > > > distribution cannot be rejected. Great. However, I don't think it
>> > > > looks nice to report the values of 8 different tests in a report.
>> > > > One note is that my sample size is really tiny (fewer than 20
>> > > > independent cases). Without wanting to start a flame war, is there
>> > > > any advice on which one(s) would be more appropriate and should be
>> > > > reported (along with a Q-Q plot)? Thank you.
>> > > >
>> > > > Regards,
>> > > >
>> > >
>> > > Wow - I have so many concerns with that approach that it's hard to
>> > > know where to begin.  But first of all, why care about normality?
>> > > Why not use distribution-free methods?
>> > >
>> > > You should examine the power of the tests for n=20.  You'll probably
>> > > find it's not good enough to reach a reliable conclusion.
>> >
>> > And wouldn't it be even worse if I used non-parametric tests?
>>
>> I believe what Frank meant was that it's probably better to use a
>> distribution-free procedure to do the real test of interest (if there is
>> one) instead of testing for normality and then using a test that
>> assumes normality.
>>
>> I guess the question is, what exactly do you want to do with the outcome
>> of the normality tests?  If those are going to be used as basis for
>> deciding which test(s) to do next, then I concur with Frank's
>> reservation.
>>
>> Generally speaking, I do not find goodness-of-fit for distributions very
>> useful, mostly for the reason that failure to reject the null is no
>> evidence in favor of the null.  It's difficult for me to imagine why
>> "there's insufficient evidence to show that the data did not come from a
>> normal distribution" would be interesting.
>>
>> Andy
>>
>>
>> > >
>> > > Frank
>> > >
>> > >
>> > >
>> >
>> >
>> > --
>> > yianni
>> >
>> > ______________________________________________
>> > R-help at stat.math.ethz.ch mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>> >
>> >
>>
>>
> 
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


