[R] shapiro.test() output
Michael Grant
mwgrant2001 at yahoo.com
Thu Jul 13 01:51:21 CEST 2006
Matthew,
You may find the following documents useful if your venture into environmental
statistics is serious.
First, the 1992 EPA Addendum on groundwater (GW) statistics--links at
http://www.epa.gov/correctiveaction/resource/guidance/sitechar/gwstats/gwstats.htm
The second is Helsel's book, hosted by the USGS at
http://pubs.usgs.gov/twri/twri4a3/
Both documents have good discussions of normality tests for GW data, including
probability plot correlation coefficients and variations in the x-axis plotting
position--Blom, Cunnane, etc.
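
For a feel of what a probability plot correlation coefficient looks like in R, here is a minimal sketch. It assumes the arsenic data quoted further down, and it uses ppoints() with a = 3/8 for Blom and a = 0.4 for Cunnane; the helper name ppcc() is made up for illustration. It only computes the coefficient and does not look up critical values.

```r
# Arsenic concentrations quoted later in this thread (n = 12)
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Hypothetical helper: correlate the sorted data with normal quantiles
# evaluated at plotting positions p_i = (i - a) / (n + 1 - 2a)
ppcc <- function(x, a) {
  x <- sort(x)
  p <- ppoints(length(x), a = a)
  cor(x, qnorm(p))
}

r_blom    <- ppcc(As, a = 3/8)  # Blom plotting position
r_cunnane <- ppcc(As, a = 0.4)  # Cunnane plotting position

print(c(Blom = r_blom, Cunnane = r_cunnane))
```

The two plotting positions should give nearly the same coefficient for a sample like this; the choice matters more in the tails.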
Helsel is a good read: 1) his writing is so clear, 2) he gets into
nonparametric approaches in so many areas of GW stats, and 3) the
typography is nice--the book is just a pleasant experience all around. Just be
advised this is only the beginning...
Oh, yes. It ain't safe to just dabble with environmental (contaminant) data--it
is too messy. Go whole hog or pass it up.
Best regards,
Michael Grant (works for the competition :O))
--- Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:
> <Matthew.Findley at ch2m.com> writes:
>
> > R Users:
> >
> > My question is probably more about elementary statistics than the
> > mechanics of using R, but I've been dabbling in R (version 2.2.0) and
> > used it recently to test some data.
> >
> > I have a relatively small set of observations (n = 12) of arsenic
> > concentrations in background groundwater and wanted to test my
> > assumption of normality. I used the Shapiro-Wilk test (by calling
> > shapiro.test() in R) and I'm not sure how to interpret the output.
> > Here's the input/output from the R console:
> >
> > >As = c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
> > >shapiro.test(As)
> >
> > Shapiro-Wilk normality test
> >
> > data: As
> > W = 0.9513, p-value = 0.6555
> >
> > How do I interpret this? I understand, from poking around the internet,
> > that the higher the W statistic the "more normal" the data.
> >
> > What is the null hypothesis - that the data is normally distributed?
>
> Yup.
>
> > What does the p-value tell me? 65.55% chance of what - getting
> > W-statistic greater than or equal to 0.9513 (I picked this up from the
> > Dalgaard book, Introductory Statistics with R, but it's not really
> > sinking in with respect to how it applies to a Shapiro-Wilk test)?
>
> *Smaller* or equal - W=1.0 is the "perfect fit". The W statistic is
> pretty much the Pearson correlation applied to the curve drawn by
> qqnorm(). (The exact definition of what goes on the x axis differs
> slightly, I believe.)
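
A rough way to see the correlation remark in R, as a sketch only: compare W from shapiro.test() with the squared Pearson correlation of the points that qqnorm() would draw. The two are not expected to match exactly, since the x-axis scores differ as noted above.

```r
# Data from the original question
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Shapiro-Wilk statistic as reported by R
W <- unname(shapiro.test(As)$statistic)

# Coordinates of the normal QQ plot, without drawing it
qq <- qqnorm(As, plot.it = FALSE)

# Squared correlation between theoretical and sample quantiles
r2 <- cor(qq$x, qq$y)^2

print(c(W = W, r_squared = r2))
```
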
>
> A low p-value would indicate that the W is too extreme to be explained
> by chance variation - i.e. evidence against normal distribution.
> In the present case you have no evidence against normal distribution
> (beware that this is not evidence _for_ normality).
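
To make the "smaller or equal" reading of the p-value concrete, here is a small simulation sketch. It assumes 10000 replicates and an arbitrary seed; it draws normal samples of the same size, computes W for each, and reports the fraction with W at or below the observed value. That fraction should land near the reported p-value, not exactly on it, because shapiro.test() uses an approximation and the simulation has sampling error.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

As  <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
obs <- shapiro.test(As)

# W is unchanged by shifting or rescaling the data, so standard normal
# samples are enough to simulate its distribution under the null hypothesis
n_sim  <- 10000
W_null <- replicate(n_sim, shapiro.test(rnorm(length(As)))$statistic)

# Share of simulated W values at least as extreme (as small) as the observed W
p_sim <- mean(W_null <= obs$statistic)

print(c(reported = obs$p.value, simulated = p_sim))
```
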
>
> (Personally, I'm not too happy about these normality tests. They tend
> to lack power in small samples and in large samples they often reject
> distributions which are perfectly adequate for normal-theory
> analysis. Learning to evaluate a QQ plot seems a better idea.)
>
>
> > The method description - retrieved using ?shapiro.test() - is a bit
> > light on details.
>
> There are references therein, though...
>
> --
> O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907