[R-sig-ME] How to fix a gamma model with poor fit?

Wed Jul 14 03:11:35 CEST 2021

A few quick comments:

1. The QQ-plot diagnostic needs to be against the appropriate
distribution. For a gamma model, the theoretical quantiles comes from
the gamma distribution, not the normal distribution.
2. You have a log link but your reaction-time data looks to be in
seconds, not milliseconds. Note that log10(0.1) = -1 and log10(1) = 0,
but log10(100) = 2 and log10(1000) = 3, so you'll get very different
answers for seconds vs. milliseconds. The reason why log transforms are
so nice for reaction times is not "just" the skew, but rather that there
is an underlying power law driving the effects *on the milliseconds* scale.

Are you using a gamma model because of the Lo and Andrews paper? I've
indicated my skepticism about that work previously, but these are my
critical points:

- there's still a transformation going on, it's just in the link function
- having a nonlinear link complicates interpretation of the coefficients
- gamma models are much harder to fit numerically (and I believe that
codepath is less well tested in lme4; it's a known problem in
MixedModels.jl)
- the usual tests on residuals, etc. now have to be done against a gamma
distribution, not a normal, but a lot of diagnostics use the normal by
default
- I don't understand the obsession with "satisfying normality
assumptions" (from their abstract) in a GLMM -- half the point of the
*generalized* bit is that you can swap in a different distributional
assumption (the other half is the use of a link function)

When thinking about using a model from a particular family/distribution,
note that the distributional assumption is *on the residuals* so the
skew in your raw data may or may not be present in the residuals. So
maybe you don't need a Gamma model at all.

Looking at your model output, it looks like you're using sum contrasts
-- great! But checkout contr.Sum from the car package for nicer looking
labels. :)

Phillip

On 13/7/21 6:18 pm, Cátia Ferreira De Oliveira via R-sig-mixed-models wrote:
> Dear all,
> 
> I am sorry for reposting here after posting on cross validated here
> <https://stats.stackexchange.com/questions/534098/glmer-gamma-model-good-fit>
> but I am still not sure what would be the best way of going about fixing
> this model. It seems to have poor fit if you look at the plots as they have
> extremes on both sides, which would not fit well with a gamma distribution.
> Despite this, the results are consistent across packages (lme4, nlme...).
> 
> I have 209062 rows of data and this is response time data.
> I want to determine whether there are differences between groups (Groups -
> 2 levels) on the learning of a task (Probability - 2 levels) across time
> (within sessions - Block - 4 levels / across sessions - Session - 2
> levels). It doesn't have zero response times, but some close to zero.
> 
> Do you have any suggestions for how one can improve a model like this or
> whether I should just use another distribution that fits the data a bit
> better?
> 
> Thank you!
> 
> Catia
> 
> 
> Model:
> 
>     glmer(RT ~ Prob * Bl * Session * Gr + (1  | Participant), data=
> Data.trimmed, family = Gamma(link =
> "log"), control=glmerControl(optimizer="bobyqa"))
> Model summary:
> 
>     Generalized linear mixed model fit by maximum likelihood (Adaptive
> Gauss-Hermite Quadrature, nAGQ = 0) ['glmerMod']
>      Family: Gamma  ( log )
>     Formula: RT ~ Probability * Block * Session * Group + (1 | Participant)
>        Data: Data.trimmed
>     Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun =
> 1e+06))
> 
>          AIC      BIC   logLik deviance df.resid
>      2456107  2456538 -1228012  2456023   209020
> 
>     Scaled residuals:
>        Min     1Q Median     3Q    Max
>     -4.297 -0.625 -0.158  0.440 35.691
> 
>     Random effects:
>      Groups      Name        Variance Std.Dev.
>      Participant (Intercept) 0.002203 0.04694
>      Residual                0.053481 0.23126
>     Number of obs: 209062, groups:  Participant, 130
> 
>     Fixed effects:
>                                             Estimate Std. Error  t value
> Pr(>|z|)
>     (Intercept)                            6.024e+00  4.182e-03 1440.439  <
> 2e-16 ***
>     Probability1                          -2.835e-02  7.041e-04  -40.265  <
> 2e-16 ***
>     Block2-1                              -2.925e-02  2.077e-03  -14.084  <
> 2e-16 ***
>     Block3-2                              -3.676e-03  2.131e-03   -1.725
> 0.084500 .
>     Block4-3                               4.085e-03  2.307e-03    1.771
> 0.076577 .
>     Block5-4                              -1.220e-02  2.380e-03   -5.125
> 2.98e-07 ***
>     Session1                               4.795e-02  7.323e-04   65.478  <
> 2e-16 ***
>     Group1                                -4.616e-02  4.182e-03  -11.037  <
> 2e-16 ***
>     Probability1:Block2-1                 -7.228e-03  2.077e-03   -3.480
> 0.000501 ***
>     Probability1:Block3-2                 -5.332e-03  2.131e-03   -2.503
> 0.012331 *
>     Probability1:Block4-3                 -2.076e-02  2.307e-03   -8.999  <
> 2e-16 ***
>     Probability1:Block5-4                  6.044e-03  2.380e-03    2.539
> 0.011104 *
>     Probability1:Session1                  1.656e-03  7.046e-04    2.351
> 0.018743 *
>     Block2-1:Session1                     -1.972e-02  2.077e-03   -9.494  <
> 2e-16 ***
>     Block3-2:Session1                     -8.521e-03  2.131e-03   -3.999
> 6.35e-05 ***
>     Block4-3:Session1                      4.380e-05  2.308e-03    0.019
> 0.984856
>     Block5-4:Session1                     -3.768e-03  2.380e-03   -1.583
> 0.113389
>     Probability1:Group1                    1.515e-03  7.041e-04    2.151
> 0.031478 *
>     Block2-1:Group1                       -6.161e-03  2.077e-03   -2.966
> 0.003015 **
>     Block3-2:Group1                       -1.129e-02  2.131e-03   -5.301
> 1.15e-07 ***
>     Block4-3:Group1                        7.095e-03  2.307e-03    3.076
> 0.002101 **
>     Block5-4:Group1                       -4.055e-03  2.380e-03   -1.704
> 0.088414 .
>     Session1:Group1                        3.782e-03  7.323e-04    5.164
> 2.41e-07 ***
>     Probability1:Block2-1:Session1         5.729e-05  2.077e-03    0.028
> 0.977997
>     Probability1:Block3-2:Session1         3.543e-03  2.131e-03    1.663
> 0.096363 .
>     Probability1:Block4-3:Session1        -6.877e-03  2.308e-03   -2.980
> 0.002886 **
>     Probability1:Block5-4:Session1         4.329e-03  2.380e-03    1.819
> 0.068952 .
>     Probability1:Block2-1:Group1          -1.238e-03  2.077e-03   -0.596
> 0.550980
>     Probability1:Block3-2:Group1           1.022e-02  2.131e-03    4.795
> 1.63e-06 ***
>     Probability1:Block4-3:Group1          -6.532e-03  2.307e-03   -2.831
> 0.004634 **
>     Probability1:Block5-4:Group1           2.351e-03  2.380e-03    0.988
> 0.323373
>     Probability1:Session1:Group1          -1.805e-03  7.046e-04   -2.562
> 0.010412 *
>     Block2-1:Session1:Group1              -2.060e-04  2.077e-03   -0.099
> 0.920984
>     Block3-2:Session1:Group1              -4.211e-03  2.131e-03   -1.977
> 0.048094 *
>     Block4-3:Session1:Group1               3.339e-03  2.308e-03    1.447
> 0.147888
>     Block5-4:Session1:Group1              -3.956e-03  2.380e-03   -1.662
> 0.096539 .
>     Probability1:Block2-1:Session1:Group1 -1.270e-03  2.077e-03   -0.611
> 0.540933
>     Probability1:Block3-2:Session1:Group1  1.678e-03  2.131e-03    0.788
> 0.430929
>     Probability1:Block4-3:Session1:Group1 -4.640e-03  2.308e-03   -2.010
> 0.044392 *
>     Probability1:Block5-4:Session1:Group1  4.714e-03  2.380e-03    1.980
> 0.047649 *
>     ---
>     Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
>     Correlation matrix not shown by default, as p = 40 > 12.
>     Use print(x, correlation=TRUE)  or
>         vcov(x)        if you need it
> 
> Plots:
> 
>   [1]: https://i.stack.imgur.com/XPdtl.png
>   [2]: https://i.stack.imgur.com/zUNRX.png
>   [3]: https://i.stack.imgur.com/6slYG.png
>   [4]: https://i.stack.imgur.com/LlRwT.png
>   [5]: https://i.stack.imgur.com/TNYCP.png
>   [6]: https://i.stack.imgur.com/45l0P.png
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>