[R] shapiro.test() output

Thu Jul 13 01:51:21 CEST 2006

Matthew,

You may find the following documents useful if your venture into environmental
statistics is serious. 

First, the 1992 EPA Addendum on groundwater (GW) statistics--links at
http://www.epa.gov/correctiveaction/resource/guidance/sitechar/gwstats/gwstats.htm

The second is Helsel's book at the USGS

http://pubs.usgs.gov/twri/twri4a3/

Both documents have good discussions on normality tests for GW data including
probability plot correlation coefficients and variations in the (x) plotting
position--Blom, Cunnane, etc.
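
As a rough illustration of a probability plot correlation coefficient with
two plotting positions, here is a minimal sketch in R. It assumes the usual
form p_i = (i - a)/(n + 1 - 2a), with a = 3/8 for Blom and a = 0.4 for
Cunnane, and uses the arsenic values from the quoted post below. The name
ppcc() is a hypothetical helper, not something taken from either document.

```r
# Arsenic data (same values as in the quoted post below)
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Probability plot correlation coefficient: correlation between the
# ordered data and normal quantiles at plotting positions
# p_i = (i - a) / (n + 1 - 2a).  ppcc() is a hypothetical helper name.
ppcc <- function(x, a) {
  n <- length(x)
  cor(sort(x), qnorm(ppoints(n, a = a)))
}

r_blom    <- ppcc(As, a = 3/8)   # Blom plotting position
r_cunnane <- ppcc(As, a = 0.4)   # Cunnane plotting position
print(c(Blom = r_blom, Cunnane = r_cunnane))
```

The two coefficients should come out nearly identical here; the choice of
plotting position mostly matters in the tails. Judging whether a coefficient
is "too low" still needs the critical values tabulated in the references.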

Helsel is a good read: 1.) his writing is so clear, 2.) he gets into
nonparametric approaches in so many areas of GW stats, and 3.) the
typography is nice--the book is just a pleasant experience all around. Just
be advised this is only the beginning...

Oh, yes. It ain't safe to just dabble with environmental (contaminant) data--it
is too messy. Go whole hog or pass it up.

Best regards,
Michael Grant (works for the competition :O))

--- Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:

> <Matthew.Findley at ch2m.com> writes:
> 
> > R Users:
> > 
> > My question is probably more about elementary statistics than the
> > mechanics of using R, but I've been dabbling in R (version 2.2.0) and
> > used it recently to test some data.
> > 
> > I have a relatively small set of observations (n = 12) of arsenic
> > concentrations in background groundwater and wanted to test my
> > assumption of normality.  I used the Shapiro-Wilk test (by calling
> > shapiro.test() in R) and I'm not sure how to interpret the output.
> > Here's the input/output from the R console:
> > 
> > 	>As = c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
> > 	>shapiro.test(As)
> > 
> >       	  Shapiro-Wilk normality test
> > 
> > 	data:  As 
> > 	W = 0.9513, p-value = 0.6555
> > 
> > How do I interpret this?  I understand, from poking around the internet,
> > that the higher the W statistic the "more normal" the data.
> > 
> > What is the null hypothesis - that the data is normally distributed?  
> 
> Yup.
> 
> > What does the p-value tell me?  65.55% chance of what - getting
> > W-statistic greater than or equal to 0.9513 (I picked this up from the
> > Dalgaard book, Introductory Statistics with R, but it's not really
> > sinking in with respect to how it applies to a Shapiro-Wilk test)?
> 
> *Smaller* or equal - W=1.0 is the "perfect fit". The W statistic is
>  pretty much the squared Pearson correlation applied to the curve drawn
>  by qqnorm(). (The exact definition of what goes on the x axis differs
>  slightly, I believe.) 
> 
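
To make the link between W and the QQ plot concrete, here is a small sketch
that compares W with the squared correlation of the points qqnorm() would
draw. It is only an approximation: the x-axis values used inside
shapiro.test() are not the same as those from qqnorm(), so the two numbers
should be close but not equal.

```r
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

sw <- shapiro.test(As)

# Coordinates of the QQ plot without drawing it:
# x = theoretical normal quantiles, y = the data
qq <- qqnorm(As, plot.it = FALSE)
r2 <- cor(qq$x, qq$y)^2

print(c(W = unname(sw$statistic), r_squared = r2))
```
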
> A low p-value would indicate that the W is too extreme to be explained
> by chance variation - i.e. evidence against normal distribution.
> In the present case you have no evidence against normal distribution
> (beware that this is not evidence _for_ normality).
> 
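
One way to see what the p-value measures is to simulate it: draw many normal
samples of the same size, compute W for each, and count how often W is at or
below the observed value. This is a sketch only; the seed and the number of
replicates are arbitrary choices, and shapiro.test() itself uses an
analytical approximation rather than simulation, so expect rough agreement.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
obs <- shapiro.test(As)

# W for many truly normal samples of the same size (n = 12)
w_null <- replicate(5000, shapiro.test(rnorm(length(As)))$statistic)

# Share of simulated W values at or below the observed W
p_sim <- mean(w_null <= obs$statistic)
print(c(p_reported = obs$p.value, p_simulated = p_sim))
```

A large share, as here, means the observed W is quite ordinary for normal
data of this size, which is why there is no evidence against normality.
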
> (Personally, I'm not too happy about these normality tests. They tend
> to lack power in small samples and in large samples they often reject
> distributions which are perfectly adequate for normal-theory
> analysis. Learning to evaluate a QQ plot seems a better idea.) 
> 
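
The sample-size effect can be sketched with a small simulation. The
assumptions are mine: a t distribution with 5 degrees of freedom stands in
for a mildly non-normal population, the 5% level is used for rejection, and
reject_rate() is a hypothetical helper. Whether such a distribution is
adequate for a given normal-theory analysis is a separate judgment.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Share of simulated samples that shapiro.test() rejects at the 5% level.
# reject_rate() is a hypothetical helper; rt(n, df = 5) is an assumed
# example of a mildly heavy-tailed, non-normal distribution.
reject_rate <- function(n, nsim = 2000) {
  mean(replicate(nsim, shapiro.test(rt(n, df = 5))$p.value < 0.05))
}

rate_small <- reject_rate(12)    # same n as the arsenic data
rate_large <- reject_rate(1000)  # a large sample
print(c(n12 = rate_small, n1000 = rate_large))
```

For the visual check, qqnorm(As) followed by qqline(As) draws the QQ plot
with a reference line.
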
>  
> > The method description - retrieved using ?shapiro.test() - is a bit
> > light on details.
> 
> There are references therein, though...
> 
> -- 
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907