[R-sig-ME] extracting p values for main effects of binomial glmm

Thu Mar 5 21:08:41 CET 2015

An issue that has not been raised so far is whether the binomial or 
other GLM type variance takes account of all relevant observation 
level sources of variation.  This is pertinent for all generalised linear
mixed models.

Over-dispersed binomial or Poisson is in my experience much more
common (certainly, e.g. in such areas as ecology) than binomial or
Poisson.  The effect on the variance can be huge, multiplying by a
factor of 3 or 4 or more.  More generally, the multiplier may not be a
constant factor.  This issue is most serious for comparisons where
the relevant variances are the observation level variances.

With glmer(), one way to handle this is to fit an observation level
random effect; the multiplier is not then a constant factor.  glmer() did
at one time allow a constant multiplier.  I think it unfortunate that this
was removed, as it restricted the options available.   Using negative
binomial errors is in principle another possibility, but such models
can be difficult to get to converge.  If the model is wrong, it may well
not converge, which is an obstacle to getting to the point where one
has something half-sensible that does converge — this is the case 
for glm.nb() as well as for glmer.nb().

With predicted probabilities that are close to 0 or 1, or Poisson means 
that are close to 0, the Hauck-Donner effect where the standard errors
for Wald statistics become nonsense and z-statistics become smaller
as the distance between means that are to be compared increases,
is something more to worry about.

I guess the message is that this is treacherous territory, and it really
is necessary to know what one is doing!

I’d like to echo Jonathon Baron’s comments on the “main effects in
the presence of interactions issue."

John Maindonald             email: john.maindonald at anu.edu.au
Wellington, NZ

On 6/03/2015, at 0:25, Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de> wrote:

> Hi,
> 
> As others have pointed out, the issue of obtaining p-values for effects is not trivial and there are pitfalls. Nevertheless I think that the easiest way to obtain what you want is to use mixed(..., method = "LRT") from package afex (full disclaimer: I am the main author of said package and function).
> 
> mixed() takes care of some issues, such as using appropriate coding and how to test main effects in the presence of interactions, in not uncontroversial but sensible ways (e.g., it does the latter by simply removing the main effect in a brute-force manner from the model-matrix and compare the full with the so reduced model). The way things are handled in mixed might not be correct in all cases but may provide exactly the combination of ease of use with sensible defaults that may be helpful in your case. In other words, it tries to adopt an approach often taken by commercial statistics packages.
> 
> You simply call mixed instead of glmer which fits and refits various versions of the model to provide likelihood-ratio tests for all effects as you want them. Note however, that mixed() cannot deal with the way you specify the model using cbind() as dependent variable. What you need to do is compute the proportions of "successes" and pass the number of observations as weights. Assuming your data resides in a data.frame df this would result in something along the following lines:
> 
> 
> df$prop_weights <- df$Offspring + df$Eggs_Offspring
> df$proportion_hatched <- df$Offspring/df$prop_weights
> 
> m1 <- mixed(proportion_hatched ~ Diet * Infection_status * Day + (1|SubjectID) + (1|Day), data = df, family=binomial, weights = df$proportion_weights, method = "LRT")
> 
> However, as Ken Beath has said, the main effects may not be sensible given interactions. To asses the degree of this threat for your data (which very much much depends on the type of interaction and their size) I suggest to follow the excellent advice of fortune(193) to "take the unusual step of plotting the data." This can be done nicely with e.g., the effects package, even for complicated interactions and binomial mixed models.
> 
> Hope this helps,
> Henrik
> 
> 
> Am 04.03.2015 um 21:11 schrieb Megan Kutzer:
>> Hi,
>> 
>> I'm fairly new to mixed models and have done a lot of reading without much
>> success. Unfortunately there is no one at my institution who is really
>> familiar with them so I thought I would try this list.
>> 
>> I'm running a binomial generalized linear mixed effects model and I need
>> p-values for the main effects. I know this isn't entirely correct with this
>> type of model but my supervisor wants the p-values!
>> 
>> The model is:
>> 
>> glmer (Proportion hatched ~ Diet * Infection status * Day + (1|SubjectID) +
>> (1|Day), family=binomial)
>> 
>> where,
>> 
>> Proportion hatched = cbind(Offspring, Eggs-Offspring)
>> Diet is a factor with 2 levels
>> Infection status is a factor with 4 levels
>> Day is a factor with 3 levels
>> 
>> Using Subject ID number and Day as random effects is supposed to control
>> for pseudoreplication in the model, although I am not entirely sure that
>> this is specified in the correct way. I wanted to include experimental
>> replicate here too but the model failed to converge.
>> 
>> My question is: is there a way to get p-values for the main fixed effects
>> of Diet, Infection and Day?
>> 
>> If you need more specific model information or the model output I would be
>> happy to provide it.
>> 
>> Thanks,
>> Megan
>> 
>> 	[[alternative HTML version deleted]]
>> 
> 
> -- 
> Dr. Henrik Singmann
> Universität Zürich, Schweiz
> http://singmann.org
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models