[R-sig-ME] likelihood-ratio tests in conflict with coefficients in maximal random effects model
Levy, Roger
rlevy at ucsd.edu
Fri Mar 7 18:13:46 CET 2014
On Mar 7, 2014, at 6:21 AM, Shravan Vasishth <vasishth.shravan at gmail.com> wrote:
> Hi Roger and Emilia, and others,
>
> I just wanted to say that in Emilia's data, she has 36 subjects and 20
> items. Roger, would you agree that it is very difficult with this amount of
> data to accurately estimate the full variance-covariance matrices for the
> subject and item random effects, especially the correlation parameters?
> The numbers that lmer returns for datasets of this size are pretty wild
> estimates, and often bear little relation to the true underlying
> correlations. I think that in this situation we might be asking too much
> of lmer without giving it enough data. If, on the other hand, we have a
> lot of data by subjects and items, it becomes possible to estimate these
> parameters.
>
> I believe this may have been, at least partly, the intent of Douglas Bates'
> original message about overparameterization.
That’s a good question. I imagine there is a fair bit of uncertainty in the correlation parameters, though I would guess it’s not huge for a dataset of this size. The point estimates that lme4(.0) gives us don’t quantify this uncertainty, but of course we could use Bayesian methods to get a better sense of it.
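(For what it’s worth, here is a quick sketch of how one could eyeball that uncertainty within lme4 itself, via profile or parametric-bootstrap intervals on the variance and correlation parameters. This is purely illustrative, run on lme4’s built-in sleepstudy data rather than on Emilia’s dataset, so none of the variable names are hers:

  library(lme4)
  fit <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
  ## intervals on the random-effect SDs and the intercept-slope correlation
  confint(fit, method = "profile", oldNames = FALSE)
  ## parametric bootstrap gives a similar picture, at more computational cost
  confint(fit, method = "boot", nsim = 200)

In my experience the interval on the correlation tends to come out quite wide even with reasonably rich data, which is rather the point.)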
More generally, this point that you raise, Shravan, is precisely the reason that I tend to favor likelihood-ratio tests over the t-statistic for confirmatory hypothesis tests like Emilia’s. As Baayen, Davidson and Bates (2008, page 396) crucially point out, the t-statistic is computed conditional on a point estimate of the random-effects covariance matrix, and fails to take into account uncertainty in the estimate of this matrix. The likelihood ratio does not have this problem. (It has other problems, namely that the likelihood-ratio test statistic is only approximately chi-squared distributed in finite samples, but with 20 items and 36 subjects in a balanced design I would expect the chi-squared approximation to be fairly close. And at any rate, the same problem exists for the t statistic.)
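Concretely, the kind of comparison I have in mind looks like the following (again sketched on sleepstudy rather than Emilia’s data): keep the maximal random-effects structure in both models and drop only the fixed effect under test, fitting by ML rather than REML so the likelihoods are comparable.

  library(lme4)
  m_full <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
  m_null <- lmer(Reaction ~ 1    + (Days | Subject), sleepstudy, REML = FALSE)
  ## one fixed-effect parameter dropped, so the LR statistic is referred
  ## to a chi-squared distribution with 1 degree of freedom
  anova(m_null, m_full)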
So my take is that how much we should worry about these issues depends in part on our modeling goals. For a confirmatory hypothesis test like Emilia’s on her dataset, I wouldn’t worry much about overparameterization for the models she was showing us. If she wanted to aggressively interpret the parameter estimates resulting from a particular model fit, on the other hand, I would be much more cautious.
Best
Roger