[R] Problems with normality req. for ANOVA

Mon Aug 2 19:32:18 CEST 2010

On Aug 2, 2010, at 9:33 AM, wwreith wrote:

>
> I am conducting an experiment with four independent variables, each of
> which has three or more factor levels. The sample size is quite large,
> i.e. several thousand. The dependent variable data does not pass a
> normality test but "visually" looks close to normal, so is there a way
> to compute the effect this would have on the p-value for ANOVA, or is
> there a way to perform a nonparametric test in R that will handle this
> many independent variables? Simply saying ANOVA is robust to small
> departures from normality is not going to be good enough for my client.

The statistical assumption of normality for linear models does not apply
to the distribution of the dependent variable, but rather to the
residuals after a model is estimated. Furthermore, it is the
homoskedasticity assumption that is more commonly violated and is also
the greater threat to validity. (And if you don't already know both of
these points, then you desperately need to review your basic modeling
practices.)
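
To make those two points concrete, here is a minimal sketch in base R. The
data are simulated and the factor names A to D are hypothetical stand-ins
for the four independent variables; nothing here comes from the poster's
actual data. It fits the model first and then inspects the residuals, both
for normality and for equal spread across groups.

```r
# Simulated stand-in data: four factors (hypothetical names), n = 4000
set.seed(1)
n <- 4000
d <- data.frame(
  A = factor(sample(c("a1", "a2", "a3"), n, replace = TRUE)),
  B = factor(sample(c("b1", "b2", "b3"), n, replace = TRUE)),
  C = factor(sample(c("c1", "c2", "c3", "c4"), n, replace = TRUE)),
  D = factor(sample(c("d1", "d2", "d3"), n, replace = TRUE))
)
# Response = group-mean shifts + normal noise, so y itself is a mixture
d$y <- 10 + 2 * as.integer(d$A) - as.integer(d$C) + rnorm(n)

# Fit the additive four-factor model and extract the residuals
fit <- lm(y ~ A + B + C + D, data = d)
res <- residuals(fit)
d$res <- res

# Normality is judged on the residuals, not on the raw response
qqnorm(res)
qqline(res)

# Homoskedasticity: look for a funnel shape in residuals vs. fitted values
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# One formal check of equal residual variance across the levels of A
bt <- bartlett.test(res ~ A, data = d)
print(bt)
```

Note that bartlett.test() is itself sensitive to non-normality, so with
several thousand observations the plots are usually more informative than
any single p-value.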

>  I need to compute an error amount for
> ANOVA or find a nonparametric equivalent.

You might get a better answer if you expressed the first part of that  
question in unambiguous terminology.  What is "error amount"?

For the second part, there is an entire CRAN Task View on Robust
Statistical Methods.
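
As one illustration of what such methods look like, the sketch below
compares ordinary least squares with a robust fit from rlm() in the MASS
package, which ships with standard R installations. This is only one of
many possible choices, not a recommendation for the poster's problem. The
data are simulated with heavy-tailed noise, and the factor names are again
hypothetical.

```r
library(MASS)  # provides rlm(); a recommended package bundled with R

# Simulated stand-in data: four factors, heavy-tailed (t, 3 df) noise
set.seed(2)
n <- 3000
d <- data.frame(
  A = factor(sample(c("a1", "a2", "a3"), n, replace = TRUE)),
  B = factor(sample(c("b1", "b2", "b3"), n, replace = TRUE)),
  C = factor(sample(c("c1", "c2", "c3", "c4"), n, replace = TRUE)),
  D = factor(sample(c("d1", "d2", "d3"), n, replace = TRUE))
)
# Only level a2 of factor A shifts the mean (by 1.5) in this simulation
d$y <- 5 + 1.5 * (d$A == "a2") + rt(n, df = 3)

# Ordinary least squares fit
ols <- lm(y ~ A + B + C + D, data = d)

# Robust fit: rlm() uses Huber M-estimation by default
rob <- rlm(y ~ A + B + C + D, data = d)

# Side-by-side coefficient estimates from the two fits
print(cbind(OLS = coef(ols), robust = coef(rob)))
```

The same model formula handles any number of factors, so four independent
variables pose no special difficulty for this approach.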

-- 

David Winsemius, MD
West Hartford, CT