[R-sig-ME] Binomial glmer(): appropriateness of link and influential points

Fri Apr 23 04:28:49 CEST 2021

My comments, which were a bit off the cuff without looking at your queries
with all the care that was desirable, were designed to highlight issues with
binomial models.  Also, for checking purposes you want to plot partial
residuals against explanatory variables in turn.  As Ben suggests, plots
using DHARMa can be a good way to go.

Alternatives to fitting a mixed model are, in your case, a model with quasibinomial
error or a betabinomial model. A betabinomial fitted with glmmTMB allows you to model the
scale parameter.  Those sorts of abilities (and plots of simulated
quantile residuals) are also available in the gamlss package.  Which model is more appropriate will
depend on how the within-subject component of variance (for the mixed model),
or the scale parameter, varies (if at all) with the fitted value.
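
A minimal sketch of those two alternatives, assuming the two 0/1 observations
per subject are first collapsed to a count y out of n = 2 trials. The data are
simulated and the names d, id, x, y and n are hypothetical, not taken from the
original poster's data; dispformula = ~ x is just one possible way of letting
the scale parameter vary.

```r
# Sketch only: simulated data, hypothetical variable names (id, x, y, n).
library(glmmTMB)

set.seed(1)
n_subj <- 500
d <- data.frame(id = factor(seq_len(n_subj)), x = rnorm(n_subj), n = 2L)

# Subject-level heterogeneity induces extra-binomial variation in the counts
u <- rnorm(n_subj, sd = 1)
d$y <- rbinom(n_subj, size = d$n, prob = plogis(-0.5 + 0.8 * d$x + u))

# Quasibinomial GLM: one dispersion factor estimated from the residuals
fit_qb <- glm(cbind(y, n - y) ~ x, family = quasibinomial, data = d)

# Beta-binomial via glmmTMB; dispformula lets the scale parameter depend on x
fit_bb <- glmmTMB(cbind(y, n - y) ~ x, dispformula = ~ x,
                  family = betabinomial(link = "logit"), data = d)

summary(fit_qb)$dispersion   # estimated quasibinomial dispersion
summary(fit_bb)              # conditional and dispersion model coefficients
```

Comparing the fixed-effect estimates and standard errors across these fits and
the mixed model is one way to see how much the choice matters for a given
data set.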

It is worth checking these alternatives.

John Maindonald             email: john.maindonald using anu.edu.au

On 23/04/2021, at 14:02, Hedyeh Ahmadi <hedyehah using usc.edu> wrote:

Thank you for the DHARMa suggestion. I tried it, but I am not sure how to interpret the plot from simulateResiduals(). I am getting the attached plot, which looks pretty linear to me. Is this a pass?

Best,

Hedyeh Ahmadi, Ph.D.
Statistician
Keck School of Medicine
Department of Preventive Medicine
University of Southern California

Postdoctoral Scholar
Institute for Interdisciplinary Salivary Bioscience Research (IISBR)
University of California, Irvine

LinkedIn
www.linkedin.com/in/hedyeh-ahmadi

________________________________
From: Ben Bolker <bbolker using gmail.com>
Sent: Thursday, April 22, 2021 6:21 PM
To: Hedyeh Ahmadi <hedyehah using usc.edu>; r-sig-mixed-models using r-project.org
Subject: Re: [R-sig-ME] Binomial glmer(): appropriateness of link and influential points

On 4/22/21 11:45 AM, Hedyeh Ahmadi wrote:
Thank you for the quick and informative reply.

1.   I was aware of using plot(fitted_model, type=c("p", "smooth")),
   but I wasn't sure this would be helpful with a 0/1 outcome, since the
   plot would be just two separate lines, as follows. Do you think
   this technique is still appropriate?

 Try the DHARMa package, which uses simulated quantile residuals to
overcome this problem.
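
A minimal sketch of the DHARMa workflow on simulated data shaped like the
problem described (0/1 outcome, two observations per subject, one random
intercept). The data and the names d, id, x and y are hypothetical. Roughly,
scaled residuals that follow the diagonal in the QQ panel and show no trend
against the predicted values or against a predictor are what a well-specified
model is expected to produce.

```r
# Sketch only: simulated data, hypothetical variable names (id, x, y).
library(lme4)
library(DHARMa)

set.seed(1)
n_subj <- 300
d <- data.frame(id = factor(rep(seq_len(n_subj), each = 2)),
                x  = rnorm(2 * n_subj))
u <- rnorm(n_subj, sd = 1)                      # subject random intercepts
d$y <- rbinom(nrow(d), 1, plogis(-0.5 + 0.8 * d$x + u[as.integer(d$id)]))

fit <- glmer(y ~ x + (1 | id), family = binomial, data = d)

# Simulate from the fitted model and compute scaled (quantile) residuals
sim <- simulateResiduals(fit, n = 250)

plot(sim)                            # QQ plot plus residuals vs. predicted
plotResiduals(sim, form = d$x)       # residuals against one predictor
testUniformity(sim, plot = FALSE)    # formal check of the QQ panel
```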

2. Yes, I have tried influence.merMod() and it takes way too long. So
   Cook's distance is still appropriate for glmer() with binomial link?

  I would think so (to be honest, most of the advice about model
diagnostics is based on "this works for linear models and should work,
at least asymptotically, for GLM(M)s as well").

Best,

Hedyeh Ahmadi, Ph.D.

------------------------------------------------------------------------
*From:* R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> on
behalf of Ben Bolker <bbolker using gmail.com>
*Sent:* Thursday, April 22, 2021 8:21 AM
*To:* r-sig-mixed-models using r-project.org
*Subject:* Re: [R-sig-ME] Binomial glmer(): appropriateness of link and
influential points

On 4/22/21 11:12 AM, Hedyeh Ahmadi wrote:
Hello all,
I have two questions regarding a GLMM with binomial family and logit link. Here is some information about my model/data before I ask my questions:

  *   My outcome is 0/1.
  *   I have continuous and categorical predictors.
  *   My data has 19000 rows with 2 observations per subject.
  *   My model only has one random intercept for each subject.
  *   I am using glmer() command in R.

My questions are as follows and any sample R code would be appreciated:

  1.  What's the best way to evaluate the appropriateness of my link function?
  2.  What's the best way to find influential points? Can I still use Cook's distance?
     *   If yes, with what package?
     *   What would be the rule of thumb for glmer() with binomial link for Cook's distance?

   An inappropriate link function will lead to nonlinearity of the
response on the linear-predictor scale, so the first thing to check is
the fitted vs. residual plot (with a smoothed line added so you can see
the trends): either

plot(fitted_model, type=c("p", "smooth"))

(maybe include pch="." since your data set is big)

or the analog via ggplot+broom.mixed: use broom.mixed::augment() to get
a data frame including .fitted and .resid, then plot it with
geom_point() and geom_smooth().
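
A minimal sketch of both variants, on simulated data with hypothetical names
(d, id, x, y). It assumes the default residual type returned by each function;
with a 0/1 outcome the points fall in two bands, so the smoothed line is the
part to look at.

```r
# Sketch only: simulated data, hypothetical variable names (id, x, y).
library(lme4)
library(broom.mixed)
library(ggplot2)

set.seed(1)
n_subj <- 300
d <- data.frame(id = factor(rep(seq_len(n_subj), each = 2)),
                x  = rnorm(2 * n_subj))
u <- rnorm(n_subj, sd = 1)
d$y <- rbinom(nrow(d), 1, plogis(-0.5 + 0.8 * d$x + u[as.integer(d$id)]))

fit <- glmer(y ~ x + (1 | id), family = binomial, data = d)

# Built-in lattice plot: fitted vs. residual with a smoother
print(plot(fit, type = c("p", "smooth"), pch = "."))

# The ggplot + broom.mixed analog
aug <- augment(fit)                  # data frame with .fitted and .resid
p <- ggplot(aug, aes(x = .fitted, y = .resid)) +
  geom_point(alpha = 0.2) +
  geom_smooth()
print(p)
```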

   There are "goodness-of-link" tests that might be generalizable to
GLMMs, but I'm not too familiar with them.

   2. There is an influence.merMod method for GLMM fits (it may be slow
for large data sets! You may want to set ncores>1). The 'car' package
has some additional functionality for plotting etc.
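
A minimal sketch of case deletion by subject, on a deliberately small
simulated data set so that the refits finish quickly. The data and the names
d, id, x and y are hypothetical, and deleting whole subjects (groups = "id")
rather than single rows is an assumption about what is of interest. No cutoff
is applied; the code only ranks subjects by Cook's distance.

```r
# Sketch only: simulated data, hypothetical variable names (id, x, y).
library(lme4)

set.seed(1)
n_subj <- 100
d <- data.frame(id = factor(rep(seq_len(n_subj), each = 2)),
                x  = rnorm(2 * n_subj))
u <- rnorm(n_subj, sd = 1)
d$y <- rbinom(nrow(d), 1, plogis(-0.5 + 0.8 * d$x + u[as.integer(d$id)]))

fit <- glmer(y ~ x + (1 | id), family = binomial, data = d)

# Refit the model once per deleted subject; raise ncores on a multicore
# machine to spread the refits over several processes
infl <- influence(fit, groups = "id", ncores = 1)

cd <- cooks.distance(infl)           # one value per deleted subject
head(sort(cd, decreasing = TRUE))    # subjects with the largest influence
```

With about 9500 subjects the same call needs about 9500 refits, which is why
it is slow on the full data; running it on a subset first gives a feel for
the cost.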

   I'm not sure about rules of thumb.

   If you are going to fit a mixed model with two binary observations
per cluster, you will be far from the range where PQL/Laplace/etc. are
going to be applicable; consider using nAGQ>1 to fit with Gauss-Hermite
quadrature.
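
A minimal sketch comparing the default Laplace fit (nAGQ = 1) with adaptive
Gauss-Hermite quadrature, on simulated data with hypothetical names (d, id, x,
y). The choice of 10 quadrature points is arbitrary; nAGQ > 1 is only
available in glmer() for models with a single scalar random effect, which is
the case for one random intercept per subject.

```r
# Sketch only: simulated data, hypothetical variable names (id, x, y).
library(lme4)

set.seed(1)
n_subj <- 300
d <- data.frame(id = factor(rep(seq_len(n_subj), each = 2)),
                x  = rnorm(2 * n_subj))
u <- rnorm(n_subj, sd = 1)
d$y <- rbinom(nrow(d), 1, plogis(-0.5 + 0.8 * d$x + u[as.integer(d$id)]))

# Default: Laplace approximation (nAGQ = 1)
fit_laplace <- glmer(y ~ x + (1 | id), family = binomial, data = d)

# Adaptive Gauss-Hermite quadrature with 10 points
fit_agq <- glmer(y ~ x + (1 | id), family = binomial, data = d, nAGQ = 10)

# Side-by-side fixed effects and random-intercept standard deviations
cbind(Laplace = fixef(fit_laplace), AGQ10 = fixef(fit_agq))
c(Laplace = attr(VarCorr(fit_laplace)$id, "stddev"),
  AGQ10   = attr(VarCorr(fit_agq)$id, "stddev"))
```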

Thank you in advance for your time.

Best,

Hedyeh Ahmadi, Ph.D.

_______________________________________________
R-sig-mixed-models using r-project.org<mailto:R-sig-mixed-models using r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models