[R-sig-ME] Binomial glmer(): appropriateness of link and influential points

Fri Apr 23 04:02:00 CEST 2021

Thank you for the DHARMa suggestion. I tried it, but I am not sure how to interpret the plot from simulateResiduals(). I am getting the attached plot, and it looks fairly linear to me. Does that count as a pass?

Best,

Hedyeh Ahmadi, Ph.D.
Statistician
Keck School of Medicine
Department of Preventive Medicine
University of Southern California

Postdoctoral Scholar
Institute for Interdisciplinary Salivary Bioscience Research (IISBR)
University of California, Irvine

LinkedIn
www.linkedin.com/in/hedyeh-ahmadi

________________________________
From: Ben Bolker <bbolker using gmail.com>
Sent: Thursday, April 22, 2021 6:21 PM
To: Hedyeh Ahmadi <hedyehah using usc.edu>; r-sig-mixed-models using r-project.org <r-sig-mixed-models using r-project.org>
Subject: Re: [R-sig-ME] Binomial glmer(): appropriateness of link and influential points

On 4/22/21 11:45 AM, Hedyeh Ahmadi wrote:
> Thank you for the quick and informative reply.
>
>  1.   I was aware of using plot(fitted_model, type=c("p", "smooth")),
>     but I wasn't sure this would be helpful with a 0/1 outcome, since
>     the plot would be just two separate lines, as follows. Do you think
>     this technique is still appropriate?

  Try the DHARMa package, which uses simulated quantile residuals to
overcome this problem.
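
A minimal sketch of the DHARMa workflow, run on simulated stand-in data (the data frame dat, its columns and the model formula are hypothetical, not the poster's data). If the model is well specified, the scaled residuals should be roughly uniform on [0, 1], so the QQ panel should follow the diagonal and the residual-vs-predicted panel should show no trend.

```r
library(lme4)
library(DHARMa)

set.seed(101)

# Hypothetical stand-in data: 200 subjects, 2 binary observations each
n_subj <- 200
dat <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = 2)),
  x       = rnorm(2 * n_subj)
)
b <- rnorm(n_subj, sd = 1)  # subject-level random intercepts
eta <- -0.5 + 0.8 * dat$x + b[as.integer(dat$subject)]
dat$y <- rbinom(nrow(dat), size = 1, prob = plogis(eta))

fit <- glmer(y ~ x + (1 | subject), data = dat, family = binomial)

# Simulate new responses from the fitted model and compute
# scaled (quantile) residuals for the observed data
sim <- simulateResiduals(fittedModel = fit, n = 250)

plot(sim)                         # QQ plot plus residuals vs. predicted
testUniformity(sim)               # formal test of the QQ panel
plotResiduals(sim, form = dat$x)  # residuals against one predictor
```

A straight QQ line on its own is encouraging but only part of the picture; the residual-vs-predicted panel and the per-predictor plots are where a poorly chosen link would be expected to show up as curvature.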
>
>  2. Yes, I have tried influence.merMod() and it takes way too long. So
>     is Cook's distance still appropriate for glmer() with a binomial link?

   I would think so (to be honest, most of the advice about model
diagnostics is based on "this works for linear models and should work,
at least asymptotically, for GLM(M)s as well").
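
A rough sketch of two ways to get Cook's distances from a glmer() fit with lme4, again on small simulated stand-in data (dat, subject, x and y are hypothetical names). The first uses hat values and Pearson residuals and needs no refitting, so it should scale to large data sets; the second is the case-deletion approach via influence(), which refits the model once per subject and is the slow part.

```r
library(lme4)

set.seed(102)

# Hypothetical stand-in data: 60 subjects, 2 binary observations each
n_subj <- 60
dat <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = 2)),
  x       = rnorm(2 * n_subj)
)
b <- rnorm(n_subj)
eta <- -0.5 + 0.8 * dat$x + b[as.integer(dat$subject)]
dat$y <- rbinom(nrow(dat), size = 1, prob = plogis(eta))

fit <- glmer(y ~ x + (1 | subject), data = dat, family = binomial)

# (a) Observation-level approximation: no refitting involved
cd_obs <- cooks.distance(fit)

# (b) Case deletion by subject: one refit per subject
#     (the ncores argument can spread the refits over several cores)
infl    <- influence(fit, groups = "subject", data = dat)
cd_subj <- cooks.distance(infl)

# Largest values first, for inspection
head(sort(cd_subj, decreasing = TRUE))
```

No cutoff is applied here; the sketch only ranks the values so the largest ones can be inspected.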

> ------------------------------------------------------------------------
> *From:* R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> on
> behalf of Ben Bolker <bbolker using gmail.com>
> *Sent:* Thursday, April 22, 2021 8:21 AM
> *To:* r-sig-mixed-models using r-project.org <r-sig-mixed-models using r-project.org>
> *Subject:* Re: [R-sig-ME] Binomial glmer(): appropriateness of link and
> influential points
>
>
> On 4/22/21 11:12 AM, Hedyeh Ahmadi wrote:
>> Hello all,
>> I have two questions regarding GLMM with binomial/logit link. Here is some information about my model/data before I ask my questions:
>>
>>    *   My outcome is 0/1.
>>    *   I have continuous and categorical predictors.
>>    *   My data has 19000 rows with 2 observations per subject.
>>    *   My model only has one random intercept for each subject.
>>    *   I am using glmer() command in R.
>>
>> My questions are as follows and any sample R code would be appreciated:
>>
>>    1.  What's the best way to evaluate the appropriateness of my link function?
>>    2.  What's the best way to find influential points? Can I still use Cook's distance?
>>       *   If yes, with what package?
>>       *   What would be the rule of thumb for glmer() with binomial link for Cook's distance?
>>
>
>     An inappropriate link function will lead to nonlinearity of the
> response on the linear-predictor scale, so the first thing to check is
> the fitted vs. residual plot (with a smoothed line added so you can see
> the trends): either
>
> plot(fitted_model, type=c("p", "smooth"))
>
> (maybe include pch="." since your data set is big)
>
> or the analog via ggplot+broom.mixed: use broom.mixed::augment() to get
> a data frame including .fitted and .resid, then plot it with
> geom_point() and geom_smooth().
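
The two options described above might look like the following sketch. The data are simulated stand-ins (dat, subject, x and y are hypothetical names), and the loess smoother is one possible choice rather than a recommendation from the thread.

```r
library(lme4)
library(broom.mixed)
library(ggplot2)

set.seed(103)

# Hypothetical stand-in data: 200 subjects, 2 binary observations each
n_subj <- 200
dat <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = 2)),
  x       = rnorm(2 * n_subj)
)
b <- rnorm(n_subj)
eta <- -0.5 + 0.8 * dat$x + b[as.integer(dat$subject)]
dat$y <- rbinom(nrow(dat), size = 1, prob = plogis(eta))

fit <- glmer(y ~ x + (1 | subject), data = dat, family = binomial)

# Option 1: built-in lattice plot of residuals vs. fitted values,
# with a smooth line; pch = "." keeps large data sets readable
print(plot(fit, type = c("p", "smooth"), pch = "."))

# Option 2: the same plot via broom.mixed + ggplot2
aug <- augment(fit)  # adds .fitted and .resid columns to the model data
p <- ggplot(aug, aes(x = .fitted, y = .resid)) +
  geom_point(alpha = 0.3) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)
print(p)
```

With a 0/1 outcome the points fall into two bands, so the smooth line is the part to look at: it should stay roughly flat if the link is adequate.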
>
>     There are "goodness-of-link" tests that might be generalizable to
> GLMMs, but I'm not too familiar with them.
>
>     2. There is an influence.merMod method for GLMM fits (it may be slow
> for large data sets! You may want to set ncores>1). The 'car' package
> has some additional functionality for plotting etc.
>
>     I'm not sure about rules of thumb.
>
>     If you are going to fit a mixed model with two binary observations
> per cluster, you will be far from the range where PQL/Laplace/etc. are
> going to be applicable; consider using nAGQ>1 to fit with Gauss-Hermite
> quadrature.
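
As an illustration of the nAGQ suggestion, the sketch below fits the same random-intercept model with the default Laplace approximation (nAGQ = 1) and with adaptive Gauss-Hermite quadrature. The data are simulated stand-ins with hypothetical names, and nAGQ = 10 is an arbitrary choice of quadrature points. In lme4, nAGQ > 1 is only available for models with a single scalar random effect, which is the case for a model with one random intercept per subject.

```r
library(lme4)

set.seed(104)

# Hypothetical stand-in data: 300 subjects, 2 binary observations each
n_subj <- 300
dat <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = 2)),
  x       = rnorm(2 * n_subj)
)
b <- rnorm(n_subj, sd = 1.5)
eta <- -0.5 + 0.8 * dat$x + b[as.integer(dat$subject)]
dat$y <- rbinom(nrow(dat), size = 1, prob = plogis(eta))

# Default: Laplace approximation (nAGQ = 1)
fit_laplace <- glmer(y ~ x + (1 | subject), data = dat,
                     family = binomial)

# Adaptive Gauss-Hermite quadrature with 10 points
fit_agq <- glmer(y ~ x + (1 | subject), data = dat,
                 family = binomial, nAGQ = 10)

# Compare fixed effects and the random-intercept SD (theta)
est <- rbind(
  Laplace = c(fixef(fit_laplace), sd_subject = unname(getME(fit_laplace, "theta"))),
  AGQ10   = c(fixef(fit_agq),     sd_subject = unname(getME(fit_agq, "theta")))
)
print(est)
```

If the two rows differ noticeably, that is a hint that the Laplace fit should not be relied on for this design.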
>
>> Thank you in advance for your time.
>> _______________________________________________
>> R-sig-mixed-models using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models