[R-sig-ME] Plotting residuals for GLMER model and zero counts

Wed Nov 6 22:33:20 CET 2013

Francesco <fbromano at ...> writes:

> Dear R-ers,
> 
> although a number of options exist out there to plot logit models, I
> can't seem
> to find one that works for glmer. What I would like to do is, in light of
> having found no interaction between the two fixed effects (which answers my
> research question), look at a plot to tell how well the model fits the data.
> 
> The package LMERconveniencefunctions ver 2.0
> has a nice plot for lmer but this won't work on my model because R
> automatically
> asks me to use glmer. It has a factor DV (Correct), two fixed factor IVs
> (Group and Syntax),
> and two random effect (ID and item).
> 
> Here's the output of calling:
> 
> model<- glmer(Correct ~ Group * Syntax + (Syntax + 1 | ID) + (Group + 1
> | item), data=...., family=binomial)

I was going to ask you to please generate a (small) reproducible example
(e.g. see http://tinyurl.com/reproducible-000 ), but I couldn't resist
the urge to show off a new feature of the development branch of lme4,
which is that you can use simulate() with a regular (g)lmer formula
to simulate data corresponding to a given mixed model ...

library(lme4)
set.seed(101)
## generate factorial combinations
d <- expand.grid(Group=c("ns","nns"),Syntax=c("'s","of"),
      ID =factor(1:38),item=factor(1:16))
## subsample randomly down to actual data size
d <- d[sample(nrow(d),size=451),]
## check ...
with(d,table(interaction(Group,Syntax)))
## parameters (approximated from your output)
params <- list(fixef=c(-9,-2,5,-1.5),theta=c(7,-1,7,3,-0.5,6))
form <- Correct ~ Group * Syntax + (Syntax + 1 | ID) + (Group + 1| item)
s <- simulate(form[-2],  ## use one-sided formula
    newdata=d,
    newparams=params,
    family=binomial)
## turn response into a factor
d$Correct <- factor(s[[1]],labels=c("InC","Cor"))
fit1 <- glmer(Correct ~ Group * Syntax +
    (Syntax + 1 | ID) + (Group + 1| item),
    data=d,
    family=binomial)
dAug <- data.frame(d,res=residuals(fit1,"pearson"))
library(lattice)
plot(fit1)  ## the default plot is not too useful
## or
bwplot(res~Group:Syntax,data=dAug)
plot(fit1,Group:Syntax~resid(.))
## how bad does the fit get if we leave out the interaction?
fit2 <- update(fit1,.~.-Group:Syntax)
plot(fit2,Group:Syntax~resid(.))
anova(fit1,fit2)
## you can also separate by item etc. pretty easily ...
plot(fit2,Group:Syntax~resid(.)|item)

> On a side note, how do I cope with cases where participants produced
> 0 counts to a cell?  I understand Chi-square does not cope well with
> these. Should I leave them as is, or turn these into NAs, or
> eliminate as outliers?

  Leave them alone.  You may be misunderstanding the framework of
GLMMs, which does *not* require a balanced design (although some
'structural' kinds of missingness, such as missing combinations
of interactions of *fixed* factors, will cause trouble) -- and
it's not based on a decomposition of a Pearson chi-squared table,
if that's what you're thinking ...