[R] Conservative "ANOVA tables" in lmer

Martin Maechler maechler at stat.math.ethz.ch
Thu Sep 7 16:52:32 CEST 2006


>>>>> "DB" == Douglas Bates <bates at stat.wisc.edu>
>>>>>     on Thu, 7 Sep 2006 07:59:58 -0500 writes:

    DB> Thanks for your summary, Hank.
    DB> On 9/7/06, Martin Henry H. Stevens <hstevens at muohio.edu> wrote:
    >> Dear lmer-ers,
    >> My thanks to all of you who are sharing your trials and tribulations
    >> publicly.

    >> I was hoping to elicit some feedback on my thoughts on denominator
    >> degrees of freedom for F ratios in mixed models. These thoughts and
    >> practices result from my reading of previous postings by Doug Bates
    >> and others.

    >> - I start by assuming that the appropriate denominator degrees of
    >> freedom lie between n - p and n - q, where n=number of observations,
    >> p=number of fixed effects (rank of model matrix X), and q=rank of Z:X.

    DB> I agree with this but the opinion is by no means universal.  Initially
    DB> I misread the statement because I usually write the number of columns
    DB> of Z as q.

    DB> It is not easy to assess the rank of Z:X numerically.  In many cases
    DB> one can reason what it should be from the form of the model, but a
    DB> general procedure to assess the rank of a matrix, especially a sparse
    DB> matrix, is difficult.
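
For a small dense design the rank can at least be checked directly. A minimal sketch in base R, assuming a hypothetical toy layout (4 groups, a 3-level treatment observed once in each group); qr() reports a numerical rank that depends on its tolerance, and forming a dense Z:X like this does not scale to the large sparse case described above:

```r
# Hypothetical toy design: 12 observations, 4 groups, 3 treatment levels
d <- data.frame(
  grp = factor(rep(1:4, each = 3)),
  trt = factor(rep(c("a", "b", "c"), times = 4))
)

X <- model.matrix(~ trt, data = d)      # fixed effects: intercept + 2 contrasts
Z <- model.matrix(~ 0 + grp, data = d)  # one indicator column per group

ZX <- cbind(Z, X)
rank_X  <- qr(X)$rank    # p in the notation of this thread
rank_ZX <- qr(ZX)$rank   # q in the notation of this thread

c(n = nrow(d), p = rank_X, q = rank_ZX, ncol_ZX = ncol(ZX))
```

Here the group indicators sum to the intercept column, so one of the 7 columns is redundant and n - q = 12 - 6 = 6.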

    DB> An alternative which can be easily calculated is n - t where t is the
    DB> trace of the 'hat matrix'.  The function 'hatTrace' applied to a
    DB> fitted lmer model evaluates this trace (conditional on the estimates
    DB> of the relative variances of the random effects).
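
To illustrate the trace idea without relying on hatTrace itself, here is a sketch in base R for a random-intercept model, where the hat matrix is that of the penalized least squares fit for an assumed variance ratio lambda (random-effect variance over residual variance). The function hat_trace() and the toy design are hypothetical, not part of lme4:

```r
# Hypothetical toy design: 12 observations, 4 groups, 3 treatment levels
d <- data.frame(
  grp = factor(rep(1:4, each = 3)),
  trt = factor(rep(c("a", "b", "c"), times = 4))
)
X <- model.matrix(~ trt, data = d)      # fixed-effects model matrix
Z <- model.matrix(~ 0 + grp, data = d)  # random-intercept indicators

# Trace of H = W (W'W + P)^{-1} W', with W = [X Z] and a ridge penalty
# 1/lambda on the random-effect columns only (illustrative helper).
hat_trace <- function(X, Z, lambda) {
  W   <- cbind(X, Z)
  P   <- diag(c(rep(0, ncol(X)), rep(1 / lambda, ncol(Z))))
  WtW <- crossprod(W)
  # trace(W A^{-1} W') equals trace(A^{-1} W'W)
  sum(diag(solve(WtW + P, WtW)))
}

lambdas <- c(0.01, 1, 100)
tr <- sapply(lambdas, function(l) hat_trace(X, Z, l))
data.frame(lambda = lambdas, t = tr, denDF = nrow(d) - tr)
```

As lambda shrinks toward 0 the trace approaches p (the rank of X), and as it grows the trace approaches the rank of Z:X, so n - t falls between the two bounds discussed in this thread.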

    >> - I then conclude that good estimates of P values on the F ratios lie
    >>   between 1 - pf(F.ratio, numDF, n-p) and 1 - pf(F.ratio, numDF, n-q).
    >>   -- I further surmise that the latter of these (1 - pf(F.ratio, numDF,
    >>   n-q)) is the more conservative estimate.
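
The two bounds are straightforward to compute. A sketch with made-up numbers (the F ratio, numDF, n, p and q below are hypothetical, not from any fitted model); lower.tail = FALSE is equivalent to 1 - pf(...) but numerically more accurate for small tail probabilities:

```r
# Hypothetical inputs, chosen only for illustration
F.ratio <- 4.5   # observed F ratio for the term of interest
numDF   <- 2     # numerator degrees of freedom
n <- 12          # number of observations
p <- 3           # rank of X
q <- 6           # rank of Z:X

# P values under the two candidate denominator degrees of freedom
p_np <- pf(F.ratio, numDF, n - p, lower.tail = FALSE)
p_nq <- pf(F.ratio, numDF, n - q, lower.tail = FALSE)

c(n_minus_p = p_np, n_minus_q = p_nq)
```

With these values the n - q P value is the larger of the two.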

This assumes that the true distribution (under H0) of that "F ratio"
*is*  F_{n1,n2}  for some (possibly non-integer)  n1 and n2.
But AFAIU, this is only approximately true at best, and
the quality of this approximation has only been investigated
empirically for some situations.
Hence, even your conservative estimate of the P value could be
wrong (I mean "wrong on the wrong side" instead of just
"conservatively wrong").  Consequently, such a P-value is only
``approximately conservative'' ...
I agree, however, that in some situations it might be a very
useful "descriptive statistic" about the fitted model.

Martin

    >> When I use these criteria and compare my "ANOVA" table to the results
    >> of an analysis of Helmert contrasts using an MCMC sample with highest
    >> posterior density intervals, I find that my conclusions (e.g. factor
    >> A, with three levels, has a "significant effect" on the response
    >> variable) are qualitatively the same.

    >> Comments?

    DB> I would be happy to re-institute p-values for fixed effects in the
    DB> summary and anova methods for lmer objects using a denominator degrees
    DB> of freedom based on the trace of the hat matrix or the rank of Z:X if
    DB> others will volunteer to respond to the "these answers are obviously
    DB> wrong because they don't agree with <whatever> and the idiot who wrote
    DB> this software should be thrashed to within an inch of his life"
    DB> messages.  I don't have the patience.
