[R-sig-ME] Binomial glmer(): appropriateness of link and influential points

Ben Bolker bbo|ker @end|ng |rom gm@||@com
Thu Apr 22 17:21:04 CEST 2021

On 4/22/21 11:12 AM, Hedyeh Ahmadi wrote:
> Hello all,
> I have two questions regarding GLMM with binomial/logit link. Here is some information about my model/data before I ask my questions:
>    *   My outcome is 0/1.
>    *   I have continuous and categorical predictors.
>    *   My data has 19000 rows with 2 observations per subject.
>    *   My model only has one random intercept for each subject.
>    *   I am using glmer() command in R.
> My questions are as follows and any sample R code would be appreciated:
>    1.  What's the best way to evaluate the appropriateness of my link function?
>    2.  What's the best way to find influential points? Can I still use Cook's distance?
>       *   If yes, with what package?
>       *   What would be the rule of thumb for glmer() with binomial link for Cook's distance?

   1. An inappropriate link function will lead to nonlinearity of the 
response on the linear-predictor scale, so the first thing to check is 
the fitted vs. residual plot (with a smoothed line added so you can see 
the trends): either

plot(fitted_model, type=c("p", "smooth"))

(maybe include pch="." since your data set is big)

or the analog via ggplot+broom.mixed: use broom.mixed::augment() to get 
a data frame including .fitted and .resid, then plot it with 
geom_point() and geom_smooth().
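
A minimal sketch of the ggplot+broom.mixed version of this plot. The simulated data frame `dd`, its columns (`subject`, `x`, `g`, `y`) and the model formula are hypothetical stand-ins for the real data; the sketch relies on augment()'s default output and assumes its .fitted/.resid columns come from the default predict() and residuals() types, which is worth checking for your installed version.

```r
library(lme4)
library(broom.mixed)
library(ggplot2)

# Simulated stand-in data (hypothetical): 2 binary observations per subject
set.seed(101)
n_subj <- 500
dd <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = 2)),
  x = rnorm(2 * n_subj),
  g = factor(sample(c("a", "b"), 2 * n_subj, replace = TRUE))
)
b <- rnorm(n_subj, sd = 1)  # subject-level random intercepts
eta <- -0.5 + 0.8 * dd$x + 0.4 * (dd$g == "b") + b[as.integer(dd$subject)]
dd$y <- rbinom(nrow(dd), size = 1, prob = plogis(eta))

fitted_model <- glmer(y ~ x + g + (1 | subject), data = dd, family = binomial)

# augment() returns the model frame plus .fitted, .resid and other columns
aa <- augment(fitted_model)

p <- ggplot(aa, aes(x = .fitted, y = .resid)) +
  geom_point(shape = ".", alpha = 0.5) +           # tiny points for a big data set
  geom_smooth() +                                  # smoothed trend line
  geom_hline(yintercept = 0, linetype = "dashed")  # reference line at zero
print(p)
```

With a 0/1 outcome the raw points fall into two bands, so the smoothed line is the part to look at: a clear trend or curvature away from zero would be the symptom of nonlinearity described above.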

   There are "goodness-of-link" tests that might be generalizable to 
GLMMs, but I'm not too familiar with them.

   2. There is an influence.merMod method for GLMM fits (it may be slow 
for large data sets! You may want to set ncores>1). The 'car' package 
has some additional functionality for plotting etc.
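
A rough sketch of how the influence() approach might look. The simulated data and model are hypothetical; deleting whole subjects (`groups = "subject"`) rather than single rows is one possible choice, not something prescribed above. The number of subjects is kept small here because the model is refit once per deleted group.

```r
library(lme4)

# Simulated stand-in data (hypothetical): 2 binary observations per subject
set.seed(101)
n_subj <- 100  # kept small so the leave-one-subject-out refits are quick
dd <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = 2)),
  x = rnorm(2 * n_subj)
)
b <- rnorm(n_subj, sd = 1)  # subject-level random intercepts
dd$y <- rbinom(nrow(dd), size = 1,
               prob = plogis(-0.5 + 0.8 * dd$x + b[as.integer(dd$subject)]))

fitted_model <- glmer(y ~ x + (1 | subject), data = dd, family = binomial)

# Refit the model with each subject deleted in turn.
# Omitting 'groups' would delete one row at a time instead;
# ncores > 1 runs the refits in parallel.
inf <- influence(fitted_model, groups = "subject", data = dd, ncores = 1)

# Cook's distance for each deleted subject, largest first
cd <- cooks.distance(inf)
head(sort(cd, decreasing = TRUE))

# If the 'car' package is installed, it should provide index plots for
# this object (an assumption to check against your car version):
# car::infIndexPlot(inf)
```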

   I'm not sure about rules of thumb.

   If you are going to fit a mixed model with two binary observations 
per cluster, you will be far from the range where PQL/Laplace/etc. are 
going to be applicable; consider using nAGQ>1 to fit with Gauss-Hermite 
quadrature.
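
A small sketch comparing the default Laplace fit (nAGQ = 1) with an nAGQ > 1 fit. The data are simulated and hypothetical, and nAGQ = 10 is an arbitrary illustrative value rather than a recommendation. In glmer(), nAGQ > 1 is only available for models with a single scalar random effect, which matches the one-random-intercept model described in the question.

```r
library(lme4)

# Simulated stand-in data (hypothetical): 2 binary observations per subject
set.seed(101)
n_subj <- 500
dd <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = 2)),
  x = rnorm(2 * n_subj)
)
b <- rnorm(n_subj, sd = 1)  # subject-level random intercepts
dd$y <- rbinom(nrow(dd), size = 1,
               prob = plogis(-0.5 + 0.8 * dd$x + b[as.integer(dd$subject)]))

# Default: Laplace approximation
fit_laplace <- glmer(y ~ x + (1 | subject), data = dd,
                     family = binomial, nAGQ = 1)

# Same model with 10 quadrature points (arbitrary choice)
fit_agq <- glmer(y ~ x + (1 | subject), data = dd,
                 family = binomial, nAGQ = 10)

# Compare fixed effects and the random-intercept standard deviation
fixed_cmp <- cbind(Laplace = fixef(fit_laplace), AGQ = fixef(fit_agq))
sd_cmp <- c(Laplace = as.data.frame(VarCorr(fit_laplace))$sdcor,
            AGQ = as.data.frame(VarCorr(fit_agq))$sdcor)
print(round(fixed_cmp, 3))
print(round(sd_cmp, 3))
```

If the two columns differ noticeably, that is a sign the Laplace approximation is not adequate for this design.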

> Thank you in advance for your time.
> Best,
> Hedyeh Ahmadi, Ph.D.
> Applied Statistician
> Keck School of Medicine
> Department of Preventive Medicine
> University of Southern California
> Postdoctoral Scholar
> Institute for Interdisciplinary Salivary Bioscience Research (IISBR)
> University of California, Irvine
> LinkedIn
> www.linkedin.com/in/hedyeh-ahmadi
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
