[R] normal distribution assumption for multi-level modelling

Bert Gunter gunter.berton at gene.com
Wed Apr 18 20:55:45 CEST 2012


Cecile:

On Wed, Apr 18, 2012 at 8:21 AM, Cecile De Cat <c.decat at leeds.ac.uk> wrote:
> Hello,
>
> I'm analysing reaction time data from a linguistic experiment (a variant of
> a lexical decision task). To check whether the data were normally
> distributed, I used shapiro.test for each participant (see commands
> below), but only one out of 21 returns a p value above 0.05.
>
>> f = function(dfr) return(shapiro.test(dfr$Target.RTinv)$p.value)
>> p = as.vector(by(newdat, newdat$Subject, f))
>> names(p) = levels(newdat$Subject)
>> names(p[p < 0.05])
>
> Removing a few outliers

!! Yikes!! I won't say "Don't do this." But I will say that this can
be a very dangerous and unscientific thing to do, leading to biased,
misleading results.

> per subject doesn't make a difference, and
> "aggressive" removal of outliers (done by subject, for each of the 6
> conditions ) still results in non-normally distributed data by subject.
>
> Does this invalidate any attempt at multi-level modelling?

How can we possibly know without knowing in detail the objectives of
the investigation, the nature of the data, and the details of the
analysis you did??!

On general principles, normality is rarely of any real importance;
lack of independence (or, in general, non-adherence to the covariance
structures specified) usually is.  So "any attempt" seems too general
a claim to support. Indeed, a careful graphical analysis -- often the
most scientifically informative thing to do anyway -- is almost always
worthwhile.
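
As a rough sketch of what such a graphical check might look like: in a
multilevel model the distributional assumptions apply to the residuals
and the random effects of the fitted model, not to each subject's raw
data, so that is where I would look. Everything below is an assumption
on my part, since I have not seen your data or model: the data are
simulated (21 subjects, 6 conditions, invented effect sizes), the names
sim, RTinv and Condition are hypothetical, and the random-intercept
model fitted with nlme::lme is only a stand-in for whatever model you
actually intend to fit.

```r
# Simulated stand-in for the real data: 21 subjects x 6 conditions x 10 trials.
# All effect sizes and variable names are invented for illustration.
library(nlme)  # recommended package shipped with standard R installations

set.seed(1)
n_subj <- 21; n_cond <- 6; n_rep <- 10
sim <- expand.grid(rep = seq_len(n_rep),
                   Condition = factor(seq_len(n_cond)),
                   Subject = factor(seq_len(n_subj)))
subj_eff <- rnorm(n_subj, sd = 0.3)             # subject-level intercept shifts
cond_eff <- seq(0, 0.5, length.out = n_cond)    # fixed condition effects
sim$RTinv <- 1.5 +
  cond_eff[as.integer(sim$Condition)] +
  subj_eff[as.integer(sim$Subject)] +
  rnorm(nrow(sim), sd = 0.2)                    # trial-level noise

# Random intercept per subject, Condition as a fixed effect.
fit <- lme(RTinv ~ Condition, random = ~ 1 | Subject, data = sim)

res <- resid(fit, type = "pearson")   # standardized within-subject residuals
ran <- ranef(fit)[["(Intercept)"]]    # estimated subject intercepts

# Look at the model's residuals and random effects, not the raw data.
op <- par(mfrow = c(1, 3))
qqnorm(res, main = "Residuals"); qqline(res)
qqnorm(ran, main = "Subject intercepts"); qqline(ran)
plot(fitted(fit), res, xlab = "Fitted", ylab = "Residual",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)
par(op)
```

With real data, heavy tails or a funnel shape in these plots would
suggest reconsidering the transformation or the variance structure; they
would not by themselves rule out a multilevel model.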

As this has little to do with R, you should follow up on a statistical
list, such as stats.stackexchange.com.

-- Bert
>
> Many thanks in advance for your help.
>
> Cecile



-- 

Bert Gunter
Genentech Nonclinical Biostatistics



