[R] detect if data is normal or skewed (without a boxplot)

Felipe Carrillo mazatlanmexico at yahoo.com
Mon Aug 11 19:55:38 CEST 2008


Thanks Jim and Ben for your replies. Reading further about testing for normality, I found shapiro.test. I understand that if the p-value is smaller than 0.05 then the data are treated as not normal; I just don't understand what the "W" means.
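
(For reference: W is the Shapiro-Wilk test statistic itself. It lies between 0 and 1, and values close to 1 mean the ordered data line up well with what a normal sample would look like; small values of W are what produce a small p-value. Below is a minimal sketch of how to pull both numbers out of the result, using the fishlength vector from the original question. The names sw, w_stat and p_val are only illustrative.)

```r
# Fish lengths (mm) from the original question
fishlength <- c(35, 32, 37, 39, 42, 45, 37, 36, 35, 34, 40, 42, 41, 50)

# shapiro.test() is in the stats package, which R loads by default
sw <- shapiro.test(fishlength)

w_stat <- unname(sw$statistic)  # the "W" shown in the printed output
p_val  <- sw$p.value            # the p-value shown next to it

# W close to 1: little evidence against normality; smaller W: more evidence
cat("W =", round(w_stat, 3), " p-value =", round(p_val, 3), "\n")
```

Printing sw directly gives the usual summary; the components above are only needed if the values have to be passed on to another program.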




Hi Felipe,
Here's one way:

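library(nortest)  # provides sf.test() (Shapiro-Francia normality test)
if (sf.test(fishlength)$p.value > 0.05) {
  t.test(fishlength)
} else {  # at top level, `else` must sit on the same line as the closing brace
  wilcox.test(fishlength)
}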

Jim
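
If the goal is to call this from Excel/VB and get back only the interval, the same idea could be wrapped in a function. This is only a sketch under some assumptions: the function name ci_by_normality and its arguments are made up for illustration, and it swaps in shapiro.test from base R for sf.test so that no extra package is needed. The Wilcoxon interval is for the (pseudo)median rather than the mean, so the two branches do not estimate quite the same thing.

```r
# Hypothetical helper: pick the test from a normality check, return the CI.
ci_by_normality <- function(x, alpha = 0.05, conf.level = 0.95) {
  if (shapiro.test(x)$p.value > alpha) {
    # Normality not rejected: t-based interval for the mean
    list(method = "t.test",
         conf.int = as.numeric(t.test(x, conf.level = conf.level)$conf.int))
  } else {
    # Normality rejected: Wilcoxon interval for the (pseudo)median.
    # exact = FALSE uses the normal approximation, which copes with ties.
    wt <- wilcox.test(x, conf.int = TRUE, conf.level = conf.level,
                      exact = FALSE)
    list(method = "wilcox.test", conf.int = as.numeric(wt$conf.int))
  }
}

fishlength <- c(35, 32, 37, 39, 42, 45, 37, 36, 35, 34, 40, 42, 41, 50)
res <- ci_by_normality(fishlength)
res$method    # which test was used
res$conf.int  # lower and upper limits
```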

> Felipe Carrillo <mazatlanmexico <at> yahoo.com> writes:
> 
> > 
> > Hello all:
> > Is there a way to detect in R if a dataset is normally distributed or
> > skewed without graphically seeing it?
> > The reason I want to be able to do this is because I have developed an
> > application with Visual Basic where Word, Access and Excel "talk" to
> > each other, and I want to integrate R into this application to estimate
> > confidence intervals on fish sizes (mm). I basically want to automate
> > the process from Excel by detecting if my data has a normal
> > distribution and then using t.test, but if my data is skewed then
> > using wilcox.test.
> > Something like the pseudo code below:
> > 
> > fishlength <- c(35,32,37,39,42,45,37,36,35,34,40,42,41,50)
> > if fishlength = "normally distributed" then
> >   t.test(fishlength)
> > else
> >   wilcox.test(fishlength)
> > 
> > I hope this isn't very confusing
> > 
> > Felipe D. Carrillo  
> > Supervisory Fishery Biologist  
> > Department of the Interior  
> > US Fish & Wildlife Service  
> > California, USA
> 
> 
>   There's a whole package (nortest) devoted to tests of normality,
> BUT: I would suggest that your procedure is not a good idea.
> It's often hard to detect non-normality, and "fail to reject"
> shouldn't mean "accept".  If you're concerned about non-normality,
> you should probably just use the Wilcoxon test all the time
> (it has about 95% of the power of the t-test if the data are
> normal: http://en.wikipedia.org/wiki/Mann-Whitney_U ), or
> use robust statistics (e.g. rlm in the MASS package).
> 
>   Ben Bolker
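
A rough sketch of what Ben's two suggestions might look like for the fish lengths, with no normality check at all. The variable names wt, fit and robust_center are illustrative only. The Wilcoxon interval is for the (pseudo)median, and an intercept-only rlm fit is just one possible way to get a robust estimate of the centre.

```r
library(MASS)  # recommended package shipped with R; provides rlm()

fishlength <- c(35, 32, 37, 39, 42, 45, 37, 36, 35, 34, 40, 42, 41, 50)

# Option 1: always use the Wilcoxon signed-rank procedure.
# conf.int = TRUE adds an interval for the (pseudo)median;
# exact = FALSE uses the normal approximation because the data contain ties.
wt <- wilcox.test(fishlength, conf.int = TRUE, conf.level = 0.95,
                  exact = FALSE)
wt$estimate  # (pseudo)median
wt$conf.int  # approximate 95% interval

# Option 2: robust location estimate from an intercept-only rlm fit
# (Huber M-estimator by default).
fit <- rlm(fishlength ~ 1)
robust_center <- unname(coef(fit))
robust_center
```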


