[R] detect if data is normal or skewed (without a boxplot)

Ben Bolker bolker at ufl.edu
Mon Aug 11 18:42:09 CEST 2008


Felipe Carrillo <mazatlanmexico <at> yahoo.com> writes:

> Hello all:
> Is there a way to detect in R if a dataset is normally distributed or skewed
> without graphically seeing it?
> The reason I want to be able to do this is because I have developed an
> application with Visual Basic where Word, Access and Excel "talk" to each
> other, and I want to integrate R into this application to estimate
> confidence intervals on fish sizes (mm). I basically want to automate the
> process from Excel by detecting if my data has a normal distribution, then
> use t.test, but if my data is skewed then use wilcox.test.
> Something like the pseudo code below:
> 
> fishlength <- c(35,32,37,39,42,45,37,36,35,34,40,42,41,50)
> if fishlength = "normally distributed" then
>     t.test(fishlength)
> else
>     wilcox.test(fishlength)
> 
> I hope this isn't very confusing.
> 
> Felipe D. Carrillo  
> Supervisory Fishery Biologist  
> Department of the Interior  
> US Fish & Wildlife Service  
> California, USA


  There's a whole package (nortest) devoted to tests of normality,
BUT: I would suggest that your procedure is not a good idea.
It's often hard to detect non-normality, especially in small samples,
and "fail to reject" shouldn't mean "accept".  If you're concerned
about non-normality, you should probably just use the Wilcoxon test
all the time (its asymptotic efficiency relative to the t-test is
about 95% when the data are normal:
http://en.wikipedia.org/wiki/Mann-Whitney_U ), or use robust
statistics (e.g. rlm in the MASS package).
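
  As a rough sketch of what "Wilcoxon all the time" or a robust fit
could look like for the fish-length vector from the question: the
object names (w, tt, fit, rob_ci), the choice of exact = FALSE, and
the normal-approximation interval around the rlm estimate are my own
assumptions for illustration, not something from the original
application.

```r
library(MASS)  # provides rlm()

fishlength <- c(35, 32, 37, 39, 42, 45, 37, 36, 35, 34, 40, 42, 41, 50)

# One-sample Wilcoxon signed-rank procedure. conf.int = TRUE asks for a
# confidence interval and a Hodges-Lehmann (pseudo)median estimate.
# The data contain ties, so an exact interval is not available;
# exact = FALSE requests the normal approximation explicitly.
# Note: the p-value tests location = 0 (the default mu), which is not
# meaningful for fish lengths; only the interval is of interest here.
w <- wilcox.test(fishlength, conf.int = TRUE, conf.level = 0.95,
                 exact = FALSE)
print(w$estimate)
print(w$conf.int)

# The usual t-based interval for the mean, for comparison.
tt <- t.test(fishlength, conf.level = 0.95)
print(tt$conf.int)

# Robust alternative: an intercept-only rlm() fit gives a robust
# (Huber M-estimate) location for the sample.
fit <- rlm(fishlength ~ 1)
est <- summary(fit)$coefficients[1, "Value"]
se  <- summary(fit)$coefficients[1, "Std. Error"]

# Assumption: a rough large-sample interval, estimate +/- z * SE.
rob_ci <- est + c(-1, 1) * qnorm(0.975) * se
print(rob_ci)
```

  Either interval can be computed the same way for every sample, so
the Excel/VB side never has to decide whether the data "look normal".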

  Ben Bolker


