[R] Conservative "ANOVA tables" in lmer

Tue Sep 12 16:58:11 CEST 2006

On 9/11/06, Manuel Morales <Manuel.A.Morales at williams.edu> wrote:
> On Mon, 2006-09-11 at 11:43 -0500, Douglas Bates wrote:
> > On 9/10/06, Andrew Robinson <A.Robinson at ms.unimelb.edu.au> wrote:
> > > On Thu, Sep 07, 2006 at 07:59:58AM -0500, Douglas Bates wrote:
> > >
> > > > I would be happy to re-institute p-values for fixed effects in the
> > > > summary and anova methods for lmer objects using a denominator degrees
> > > > of freedom based on the trace of the hat matrix or the rank of Z:X if
> > > > others will volunteer to respond to the "these answers are obviously
> > > > wrong because they don't agree with <whatever> and the idiot who wrote
> > > > this software should be thrashed to within an inch of his life"
> > > > messages.  I don't have the patience.
> > >
> > > This seems to be more than fair to me.  I'll volunteer to help explain
> > > why the anova.lmer() output doesn't match SAS, etc.  Is it worth
> > > putting a caveat in the output and the help files?  Is it even worth
> > > writing a FAQ about this?
> >
> > Having made that offer I think I will now withdraw it.  Peter's
> > example has convinced me that this is the wrong thing to do.
> >
> > I am encouraged by the fact that the results from mcmcsamp correspond
> > closely to the correct theoretical results in the case that Peter
> > described.  I appreciate that some users will find it difficult to
> > work with a MCMC sample (or to convince editors to accept results
> > based on such a sample) but I think that these results indicate that
> > it is better to go after the marginal distribution of the fixed
> > effects estimates (which is what is being approximated by the MCMC
> > sample - up to Bayesian/frequentist philosophical differences) than to
> > use the conditional distribution and somehow try to adjust the
> > reference distribution.
>
> Am I right, however, that the MCMC sample cannot be used to evaluate
> the significance of parameter groups, for example, to assess the
> significance of a three-level factor?  Are there better alternatives
> than simply adjusting the CI for the number of factor levels
> (1-alpha/levels)?

Hmm - I'm not sure which confidence interval and which number of levels
you mean there, so I can't comment on that method.

Suppose we go back to Spencer's example and consider whether there is a
significant effect for the Nozzle factor.  That is equivalent to testing
the hypothesis H_0: beta_2 = beta_3 = 0 versus the general alternative.
A "p-value" could be formulated from an MCMC sample if we assume that
the marginal distribution of the parameter estimates for beta_2 and
beta_3 has roughly elliptical contours.  You can check that assumption
by, say, examining a hexbin plot of the values in the MCMC sample.  One
could take the ellipses as defined by the standard errors and
estimated correlation or, probably better, by the observed standard
deviations and correlations in the MCMC sample.  Then determine the
proportion of (beta_2, beta_3) pairs in the sample that fall outside
the ellipse that is centered at the estimates, has that eccentricity
and those scaling factors, and passes through (0,0).  That proportion
would be an empirical p-value for the test.

I would recommend calculating this for a couple of samples to check on
the reproducibility.
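
The procedure described above can be sketched roughly as follows.  This
is only an illustration under stated assumptions, not code from lme4:
it is written in Python with the standard library rather than R, the
function names empirical_p_value and simulate_draws are hypothetical,
and the simulated correlated normal draws merely stand in for the
(beta_2, beta_3) columns of a real MCMC sample.  The ellipse is defined
by the observed standard deviations and correlation of the draws, and
"outside the ellipse through (0,0)" is read as having a larger squared
Mahalanobis distance from the center than the origin has.

```python
import random


def empirical_p_value(sample, center=None):
    """Share of (beta_2, beta_3) draws outside the ellipse through (0, 0).

    `sample` is a list of (beta_2, beta_3) pairs.  `center` defaults to
    the sample means; pass the fitted estimates to center the ellipse
    on them instead.
    """
    n = len(sample)
    m2 = sum(b2 for b2, _ in sample) / n
    m3 = sum(b3 for _, b3 in sample) / n
    # Observed variances and covariance of the draws set the ellipse shape.
    s22 = sum((b2 - m2) ** 2 for b2, _ in sample) / (n - 1)
    s33 = sum((b3 - m3) ** 2 for _, b3 in sample) / (n - 1)
    s23 = sum((b2 - m2) * (b3 - m3) for b2, b3 in sample) / (n - 1)
    det = s22 * s33 - s23 ** 2
    c2, c3 = (m2, m3) if center is None else center

    def dist2(b2, b3):
        # Squared Mahalanobis distance from the center, using the
        # closed-form inverse of the 2x2 covariance matrix.
        d2, d3 = b2 - c2, b3 - c3
        return (s33 * d2 * d2 - 2.0 * s23 * d2 * d3 + s22 * d3 * d3) / det

    d0 = dist2(0.0, 0.0)  # the ellipse that passes through the origin
    return sum(dist2(b2, b3) > d0 for b2, b3 in sample) / n


def simulate_draws(n, seed, mean=(1.0, -0.5), sd=(0.5, 0.5), rho=0.6):
    """Hypothetical stand-in for an MCMC sample: correlated normal draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        b2 = mean[0] + sd[0] * z1
        b3 = mean[1] + sd[1] * (rho * z1 + (1.0 - rho ** 2) ** 0.5 * z2)
        draws.append((b2, b3))
    return draws


if __name__ == "__main__":
    # Repeat on a few independent samples to check reproducibility.
    for seed in (1, 2, 3):
        p = empirical_p_value(simulate_draws(10000, seed))
        print("seed", seed, "empirical p-value:", p)
```

With a real fit one would replace simulate_draws by the two relevant
columns of the MCMC sample and pass the fixed-effects estimates as
`center`.  The calculation is only meaningful if the hexbin plot
supports the elliptical-contour assumption.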