[R] Normality test

Sat May 28 22:21:56 CEST 2011

To build on Robert's suggestion (which is very good to begin with), you might consider using the vis.test function in the TeachingDemos package with the vt.qqnorm function.  This will create the qq plot of your data along with several other qqplots of normal samples of the same size.  If you cannot tell which of the plots is your data, then your data is probably close enough to normal for most practical purposes.  It will give you a p-value based on your ability to distinguish your data from random normals if you need one.
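
A minimal sketch of that call, assuming the TeachingDemos package is installed. The vector x is made-up stand-in data, not the poster's, and the test only makes sense in an interactive session because it asks you to click on the plot you think is the real data:

```r
# Stand-in data: 300 draws from a skewed distribution (an assumption
# for illustration only; substitute your own numeric vector).
set.seed(42)
x <- rexp(300, rate = 1/3)

# vis.test() shows the qq plot of x hidden among qq plots of simulated
# normal samples and asks you to pick out the real one.
if (interactive() && requireNamespace("TeachingDemos", quietly = TRUE)) {
  TeachingDemos::vis.test(x, TeachingDemos::vt.qqnorm)
}
```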

If you need more precision, then the most precise normality test is SnowsPenultimateNormalityTest, also in TeachingDemos.  However, the documentation for that function tends to be more useful than the function itself.

If you really want to choose among the different normality tests in nortest (or elsewhere), then you should investigate what assumptions they make and what types of alternatives they are most powerful against.  Also decide what types of non-normality you really care about, then use that to choose among them.  Consider two distributions: one is uniform between 0 and 1 with height 1; the other also has height 1 between 0 and 0.99, but also has height 1 between 999.99 and 1000, and is zero elsewhere.  Are these 2 distributions different in a meaningful way?  They have very different means and variances, but for most samples they will look the same (and if you throw out outliers they will look even more similar).  Different tests give different results because they focus on different types of differences.
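
The two distributions above can be simulated in base R to see this. This is only a sketch: the sample size of 50 and the seed are arbitrary choices for illustration, and the variable names are made up here.

```r
set.seed(1)
n <- 50  # arbitrary small sample size (an assumption, not from the thread)

# Distribution A: uniform on (0, 1).
a <- runif(n)

# Distribution B: density 1 on (0, 0.99) and on (999.99, 1000), zero
# elsewhere, so each draw lands in the far piece with probability 0.01.
in_far_piece <- runif(n) < 0.01
b <- ifelse(in_far_piece, runif(n, 999.99, 1000), runif(n, 0, 0.99))

# The theoretical means are 0.5 for A and about 10.49 for B, yet a given
# sample from B may contain no far points at all.
sum(in_far_piece)
c(mean_a = mean(a), mean_b = mean(b))
c(median_a = median(a), median_b = median(b))
```

Rerunning this with different seeds shows how often a sample from B is indistinguishable from a sample from A, and how much the sample mean jumps when a far point does appear.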

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Robert Baer
Sent: Friday, May 27, 2011 5:28 PM
To: Salil Sharma; R-help at r-project.org
Subject: Re: [R] Normality test

> I am writing to inquire about normality test given in nortest package. I
> have a random data set consisting of 300 samples. I am curious about which
> normality test in R would give me precise measurement, whether data sample
> is following normal distribution. As p value in each test is different in
> each test, if you could help me identifying a suitable test in R for this
> medium size of data, it will be grateful.

I am neither a statistician nor an expert on these types of tests, but I'm 
guessing that you are unlikely to get a good answer even from people with 
such qualifications, as such judgments can only be made in the context of a 
specific problem.  You have not provided us with such a problem (please read 
the posting guide).

That admonishment aside, I typically start by using qqnorm() and qqline() to 
plot my data against the expected theoretical quantiles.  If your data is 
close to normal, the points will fall close to the line.  Skewness and 
deviations from normality in the tails produce very characteristic patterns 
in the plots, which you can learn about by plotting some simulated data that 
is left-skewed, right-skewed, long tailed, or short tailed.
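
One way to generate such practice plots in base R is sketched below. The particular distributions chosen for each shape (exponential, t with 3 degrees of freedom, uniform) and the sample size are assumptions picked for illustration:

```r
set.seed(123)
n <- 300

# One simulated sample per shape; the generating distributions are
# illustrative choices, not the only ones that produce these patterns.
sims <- list(
  "normal"       = rnorm(n),
  "right-skewed" = rexp(n),
  "left-skewed"  = -rexp(n),
  "long-tailed"  = rt(n, df = 3),
  "short-tailed" = runif(n)
)

# Draw a normal qq plot with a reference line for each sample.
op <- par(mfrow = c(2, 3))
for (nm in names(sims)) {
  qqnorm(sims[[nm]], main = nm)
  qqline(sims[[nm]])
}
par(op)
```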

I personally find this graphical feedback to be a much more useful way to 
understand my data than doing a single normality test that produces a 
p-value based upon assumptions I may not be privy to.

For more, see the help by typing:
?qqnorm
?qqline

Rob

------------------------------------------
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965
