[R-sig-ME] specifying crossed random effects for glmmPQL / lme

Mon Oct 2 02:25:58 CEST 2017

  There are a few issues here: see comments inline.

On 17-09-26 05:03 PM, Van Rynald Liceralde wrote:
> Hello,
> 
> I'm trying to fit a GLMM on simulated response time data (continuous,
> positively skewed) obtained from hypothetical participants (Subject)
> responding to the same set of hypothetical items (Item), so it's a
> fully-crossed design. I intend to include several crossed-random effects
> for Subject and Item, so in lme4 language, it would look like the following:
> 
> glmer(y ~ x1*x2*z1 + (1+x1+x2|Subject) + (1|Item),
> family=Gamma("identity"), data=foo)

   I've seen the arguments that one should use a Gamma with
identity link for response time data; I didn't find them 100%
convincing, but whatever (can someone remind me of the reference?).
Nevertheless, be aware that fitting models where the link function
doesn't constrain the predicted value to the domain of the specified
probability distribution (e.g. Gamma/inverse, Gamma/identity,
binomial/identity ...) is much more likely to be computationally
problematic.
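
  To make the domain issue concrete, here is a minimal base-R sketch.
The linear-predictor values in `eta` are made up for illustration; the
point is only that the identity link passes a negative linear predictor
straight through as a negative "mean", which is outside the Gamma
distribution's support, whereas the log link cannot do that.

```r
# Hypothetical linear-predictor values, as might arise mid-optimization
eta <- c(-1, 0.5, 2)

fam_id  <- Gamma(link = "identity")
fam_log <- Gamma(link = "log")

# Map the linear predictor to the mean under each link
mu_id  <- fam_id$linkinv(eta)   # identity: mu equals eta, so mu can be <= 0
mu_log <- fam_log$linkinv(eta)  # log: mu = exp(eta), always positive

# A Gamma mean must be strictly positive
ok_id  <- all(mu_id > 0)
ok_log <- all(mu_log > 0)

print(data.frame(eta = eta, mu_identity = mu_id, mu_log = mu_log))
```

  Here `ok_id` is FALSE and `ok_log` is TRUE; an optimizer that wanders
into a region like the first element of `eta` has to be rescued somehow,
which is where the computational trouble tends to come from.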

> However, as I read from Ben Bolker's GLMM FAQ (
> https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#fn1), the
> estimation procedure used by glmer (adaptive Gauss-Hermite quadrature) can
> only handle up to 2-3 random effects. Indeed, running glmer on my simulated
> data not only results in inevitable non-convergence but also takes such a
> long time to run.

  AGHQ is not glmer's default; Laplace (equivalently, AGHQ with a single
quadrature point) is.

> Someone recommended to me to use MASS::glmmPQL instead because the cases in
> which penalized quasi-likelihood (PQL) would perform poorly (count/binomial
> DV, mean DV < 5) don't apply to my data (continuous DV, identity link,
> many items, and many subjects). Moreover, PQL could handle more random
> effects than GHQ; it could also allow for correlations of random effects to
> be estimated; and it estimates the model faster than GHQ. (I don't actually
> know about any of those being accurate characterizations of PQL and GHQ;
> would be happy to be corrected and pointed to the right direction.)

  The underlying characteristic for whether glmmPQL works well is how
close the sampling distributions of the conditional modes are to being
Gaussian. This generally fails badly in settings where there is little
information on each cluster, which is true for low-count data; I'm not
quite sure how "information per observation" maps onto the Gamma
distribution, although very small shape parameters/skewed distributions
would probably be worse than approximately Normal responses. If you have
many items per subject you're probably OK.

 It is certainly true that where it is sufficiently accurate, PQL is
faster than Laplace or AGHQ.  I'm not sure what you mean by "also allow
for correlations of random effects to be estimated" ...

> 
> The solution suggested online on CrossValidated is as follows:
> 
> bar <- glmmPQL(y ~ x1*x2*z1, random=list(Subject=~1+x1+x2, Item=~1),
>                family=Gamma("identity"), data=foo)
> 
> but this way of doing it seems to model the random effect for Item as if it
> was nested under Subject, but I want them to be identified as crossed. I
> was wondering if someone can point me to how I'd be able to specify my
> model using glmmPQL such that the effects of Subject and Item are truly
> crossed. Thank you so much!

   Unfortunately crossed effects are rather challenging to implement in
nlme (the platform underlying glmmPQL). There is one example in one of
the later chapters of Pinheiro and Bates (2000), but I'm not in a
position to look it up right now ...
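
  For what it's worth, one commonly described workaround in lme is
sketched below; I can't promise it is the same construction as the book
example. It puts all observations in a single dummy group and gives that
group a block-diagonal random-effects covariance, one block per crossed
factor. The data are simulated, and `dummy`, `subj_eff` and `item_eff`
are hypothetical names introduced only for this illustration. Note the
limits: this shows crossed random *intercepts* with a Gaussian response
only. Adding correlated random slopes for Subject, or passing the same
`random` argument on to glmmPQL, is not shown here and would need to be
checked.

```r
library(nlme)

set.seed(1)

# Fully crossed design: every Subject responds to every Item
foo <- expand.grid(Subject = factor(1:20), Item = factor(1:15))

# Simulated random intercepts (illustrative values only)
subj_eff <- rnorm(20, sd = 1)
item_eff <- rnorm(15, sd = 0.5)

foo$x1 <- rnorm(nrow(foo))
foo$y  <- 2 + 0.5 * foo$x1 +
  subj_eff[as.integer(foo$Subject)] +
  item_eff[as.integer(foo$Item)] +
  rnorm(nrow(foo), sd = 0.3)

# Single-level grouping factor so that all rows share one "group"
foo$dummy <- factor(1)

# One pdIdent block per crossed factor: each block is a multiple of the
# identity, i.e. independent intercepts with a common variance
fit <- lme(y ~ x1, data = foo,
           random = list(dummy = pdBlocked(list(
             pdIdent(~ 0 + Subject),
             pdIdent(~ 0 + Item)))))

print(VarCorr(fit))
```

  The design choice here is that the crossing is expressed through
indicator columns (`~ 0 + Subject`, `~ 0 + Item`) inside one group rather
than through nested grouping levels, which is why Item is no longer
treated as nested within Subject. The cost is large dense model
matrices, so this gets slow with many subjects and items.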

> 
> Sincerely,
> Van Liceralde
>