[R-sig-ME] interpreting significance from lmer results for dummies (like me)

Andrew Robinson A.Robinson at ms.unimelb.edu.au
Sat Apr 26 09:59:21 CEST 2008

Hi Mark,

On Fri, Apr 25, 2008 at 11:53:24PM -0400, Mark Kimpel wrote:
> I am a bioinformatician, with my strongest background in molecular
> biology. I have been trying to learn about mixed-effects to improve the
> analysis of my experiments, which certainly contain random effects. I will
> admit to being totally lost in the discussions regarding lack of p-value
> reporting in the current versions of lmer. Furthermore, I suspect those that
> need to publish to non-statistical journals will face reviewers who are
> equally in the dark. Where can I find a biologist-level explanation of the
> current controversy, 

I'll take a stab.

1) the traditional, Fisher-style test of a null hypothesis is based on
   computing the probability of observing a test statistic as extreme
   or more extreme than the one actually observed, assuming that the
   null hypothesis is true.  This probability is called the p-value.
   If the p-value is less than some cut-off, e.g. 0.01, then the null
   hypothesis is rejected.

2) in order to compute that p-value, we need to know the cumulative
   distribution function of the test statistic when the null
   hypothesis is true. In simple cases this is easy: for example, we
   use the t-distribution for the comparison of two normal means (with
   assumed equal variances etc).
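To make (1) and (2) concrete, here is a small sketch (mine, pure
Python, not from the original exchange): even when the null
distribution is known analytically, we can approximate it by
simulation, drawing both samples from one common normal distribution
under the null and asking how often the simulated statistic is as
extreme as the observed one.

```python
import random
import statistics

def t_stat(x, y):
    # Pooled two-sample t statistic (equal variances assumed).
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) +
           (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / (sp2 * (1 / nx + 1 / ny)) ** 0.5

def simulated_p_value(x, y, n_sim=10000, seed=1):
    # Monte Carlo p-value: simulate the null (both groups drawn from
    # one normal fitted to the pooled data), then count how often the
    # simulated statistic is at least as extreme as the observed one.
    rng = random.Random(seed)
    obs = abs(t_stat(x, y))
    pooled = list(x) + list(y)
    mu = statistics.mean(pooled)
    sd = statistics.stdev(pooled)
    hits = 0
    for _ in range(n_sim):
        sx = [rng.gauss(mu, sd) for _ in x]
        sy = [rng.gauss(mu, sd) for _ in y]
        if abs(t_stat(sx, sy)) >= obs:
            hits += 1
    return hits / n_sim
```

For the simple two-mean comparison this agrees with the t-based
p-value up to Monte Carlo error; the point of the sketch is that the
same recipe still works when, as below, no closed-form reference
distribution is available.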

3) in (many) hierarchical models the cumulative distribution function
   of the test statistic when the null hypothesis is true is simply not
   known.  So, we can't compute the p-value.  

3a) in a limited range of hierarchical models that have historically
    dominated analysis of variance, e.g. split-plot designs, the
    reference distribution is known (it's F).  

3b) Numerous experts have (quite reasonably) built up a bulwark of
    intuitive knowledge about the analysis of such designs.

3c) the intuition does not necessarily pertain to the analysis of any
    arbitrary hierarchical design, which might be unbalanced, and have
    crossed random effects.  That is, the intuition might be applied,
    but inappropriately.

4) in any case, the distribution that is intuitively or otherwise
   assumed is the F, because it works in the cases mentioned in 3a.
   All that remains is to define the degrees of freedom.  The
   numerator degrees of freedom are obvious, but the denominator
   degrees of freedom are not known.

4a) numerous other packages supply approximations to the denominator
    degrees of freedom, e.g. Satterthwaite, and Kenward-Roger (which
    is related).  These have been subjected to only a modest degree
    of scrutiny.
5) however, it is not clear that the reference distribution is really
   F at all, and therefore it is not clear that correcting the
   denominator degrees of freedom is what is needed.  Confusion
   reigns over how the p-values should be computed, and because of
   this confusion, Doug Bates declines to provide p-values.
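The danger in (3c) is easy to demonstrate by simulation (again a
sketch of mine, not from the thread): generate clustered data with no
treatment effect, analyse it with a naive t test that pretends the
observations are independent, and the nominal 5% test rejects far too
often.  Pure Python, with the large-sample normal cutoff 1.96
standing in for the t cutoff:

```python
import random
import statistics

def naive_t(x, y):
    # Pooled two-sample t statistic, ignoring any cluster structure.
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) +
           (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / (sp2 * (1 / nx + 1 / ny)) ** 0.5

def naive_type1_rate(n_sim=2000, clusters=5, per_cluster=10, seed=2):
    # The null is true: no treatment effect.  Each arm consists of
    # `clusters` clusters whose members share a random intercept
    # (sd 1) on top of residual noise (sd 1).  The naive test treats
    # all observations as independent and rejects when |t| > 1.96.
    rng = random.Random(seed)

    def arm():
        data = []
        for _ in range(clusters):
            u = rng.gauss(0, 1)  # shared cluster effect
            data.extend(u + rng.gauss(0, 1) for _ in range(per_cluster))
        return data

    rejections = sum(abs(naive_t(arm(), arm())) > 1.96 for _ in range(n_sim))
    return rejections / n_sim
```

With these settings the observed Type I error rate lands well above
the nominal 0.05, which is exactly why intuition carried over from
balanced, independent-observation designs can mislead here.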

> how can I learn how to properly judge significance from my lmer
> results,

There are numerous approximations, but no way to properly judge
significance as far as I am aware.  Try the R-wiki for algorithms, and
be conservative.  


Or, use lme, report the p-values computed therein, and be aware that
they are not necessarily telling you exactly what you want to know.

> and what peer-reviewed references can I steer reviewers
> towards?

Not sure about that one.  I'm working on some simulations with Doug
but it's slow going, mainly because I'm chronically disorganised.

> I understand, from other threads, that some believe a paradigm shift
> away from p-values may be necessary, but it is not clear to me
> what paradigm will replace this entrenched view. I can appreciate the
> fact that there may be conflicting opinions about the best
> equations/algorithms for determining significance, but is there any
> agreement on the goal we are heading towards?

The conflict is not about p-values per se, but about the way that they
are calculated.  I would bet that the joint goal is to find an
algorithm that provides robust, reasonable inference in a sufficiently
wide variety of cases that its implementation proves to be worthwhile.

I hope that this was helpful.


Andrew Robinson  
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
