[R-sig-ME] Fw: glmer and nAGQ

Fri Jun 18 23:44:09 CEST 2021

   This is kind of hard to read, but I'll try.

   First of all, log-likelihoods with nAGQ>1 (adaptive Gauss-Hermite 
quadrature) **are not currently commensurate with log-likelihoods with 
nAGQ==1 (Laplace approximation). This is hinted at in the "Deviance and 
log-likelihood of GLMMs" section under "Deviance and log-likelihood of 
GLMMs".

   In any case, the log-likelihoods computed under different numbers of 
quadrature points aren't comparable as though they were different 
statistical models (AIC, likelihood ratio tests, etc.). Rather, they are 
based on **different approximations** to the same model, and we know 
that larger numbers of quadrature points (increasing nAGQ) are *more 
accurate* (although slower) approximations.  The main thing is to see if 
*results* (parameter estimates etc.) change as nAGQ increases by a 
magnitude that is important for the current analysis ... if they do, 
then you should use an nAGQ that is large enough that it approximates 
"infinity", i.e. increasing nAGQ further doesn't change the answers by 
an important amount.

   In general if it is not important to you to use the inverse link (for 
interpretation/because you think this will be the natural scale on which 
to measure effects of continuous parameters and/or interactions), 
Gamma(link="log") is generally more robust.

   The "rescale variables?" suggestion is just that; if the parameters 
are already rescaled, then it's not going to help.  A large eigenvalue 
may not actually be a problem, it just indicates the *possibility* of 
numerical instability. In this case my guess is that the very small 
random effects variance might be responsible.

  It's also a little surprising that the normRain coefficient is very 
small in magnitude and yet but has a very large Z-statistic, given this 
sample size.

  The results with increasing AGQ do indeed look kind of wonky.  If this 
were my problem I would want to look more carefully at the data/think 
about where they came from.  How do the model diagnostics look?

On 6/18/21 3:52 PM, Peter R Law via R-sig-mixed-models wrote:
> I am re-sending this query that I originally emailed June 13'th. I did not receive a copy of the sent email as I have with previous postings and my query doesn't appear in the archive of postings so does not seem to have been received.
> 
> Peter R Law
> 
> Sent with [ProtonMail](https://protonmail.com/) Secure Email.
> 
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Monday, June 14th, 2021 at 10:48 PM, Peter R Law <prldb using protonmail.com> wrote:
> 
>> Any help with the following query is much appreciated.
>>
>> I used some simulated data (not generated under any specific distributional assumption but all responses are positive quantities) to investigate the nAGQ argument in glmer, running a Gamma-distribution model. With nAAGQ=2 the logLik is dramatically different to the default value of nAGQ=1, while nAGQ=5 returned minus infinity for the logLik, but the estimates of the fixed effect parameters are somewhat consistent across each computation. Are the differences in the estimated logLik surprising or do they reflect the warnings glmer returns for this attempted modelling? I got similar results for a real dataset too.
>>
>> data.frame':500 obs. of8 variables:
>>
>> $ IBI: num25.5 25.4 25.2 25.6 25.8 ...
>>
>> $ MatID: Factor w/ 99 levels "M1","M10","M11",..: 1 1 1 1 1 12 12 12 12 12 ...
>>
>> $ Pop: Factor w/ 5 levels "P1","P2","P3",..: 1 1 1 1 1 1 1 1 1 1 ...
>>
>> $ density : int11 13 15 19 28 11 13 15 19 28 ...
>>
>> $ rain: num41.1 36.6 31.6 40 40.6 ...
>>
>> $ normIBI : num-1.28 -1.29 -1.31 -1.26 -1.24 ...
>>
>> $ normDens: num-1.72 -1.66 -1.61 -1.49 -1.23 ...
>>
>> $ normRain: num-0.249 -0.64 -1.073 -0.345 -0.287 ...
>>
>>> M61 <- glmer(IBI~normDens+normRain + (1|MatID), family=Gamma, data=Sim)
>>
>> Warning message:
>>
>> In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,:
>>
>> Model is nearly unidentifiable: very large eigenvalue
>>
>> - Rescale variables?
>>
>>> summary(M61)
>>
>> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
>>
>> Family: Gamma( inverse )
>>
>> Formula: IBI ~ normDens + normRain + (1 | MatID)
>>
>> Data: Sim
>>
>> AICBIClogLik deviance df.resid
>>
>> 1297.61318.7-643.81287.6495
>>
>> Scaled residuals:
>>
>> Min1QMedian3QMax
>>
>> -1.58054 -0.306190.041690.367441.31012
>>
>> Random effects:
>>
>> GroupsNameVarianceStd.Dev.
>>
>> MatID(Intercept) 5.802e-06 0.002409
>>
>> Residual1.368e-03 0.036989
>>
>> Number of obs: 500, groups:MatID, 99
>>
>> Fixed effects:
>>
>> Estimate Std. Error t value Pr(>|z|)
>>
>> (Intercept)3.083e-026.090e-0450.63<2e-16 ***
>>
>> normDens -1.720e-037.946e-05-21.65<2e-16 ***
>>
>> normRain4.889e-042.463e-0519.85<2e-16 ***
>>
>> ---
>>
>> Signif. codes:0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> Correlation of Fixed Effects:
>>
>> (Intr) nrmDns
>>
>> normDens0.028
>>
>> normRain0.000 -0.044
>>
>> convergence code: 0
>>
>> Model is nearly unidentifiable: very large eigenvalue
>>
>> - Rescale variables?
>>
>> What kind of scaling is being suggested in the case of the default value of nAGQ? The predictors are already normalized.
>>
>>> M62 <- glmer(IBI~normDens+normRain + (1|MatID), family=Gamma, nAGQ=2,data=Sim)
>>
>> boundary (singular) fit: see ?isSingular
>>
>>> summary(M62)
>>
>> Generalized linear mixed model fit by maximum likelihood (Adaptive Gauss-Hermite Quadrature, nAGQ = 2) ['glmerMod']
>>
>> Family: Gamma( inverse )
>>
>> Formula: IBI ~ normDens + normRain + (1 | MatID)
>>
>> Data: Sim
>>
>> AICBIClogLik deviance df.resid
>>
>> 21.742.8-5.911.7495
>>
>> Scaled residuals:
>>
>> Min1QMedian3QMax
>>
>> -2.3532 -0.6617 -0.13050.51013.4180
>>
>> Random effects:
>>
>> GroupsNameVariance Std.Dev.
>>
>> MatID(Intercept) 0.000000.0000
>>
>> Residual0.024160.1554
>>
>> Number of obs: 500, groups:MatID, 99
>>
>> Fixed effects:
>>
>> Estimate Std. Error t value Pr(>|z|)
>>
>> (Intercept)0.02882820.001300822.161< 2e-16 ***
>>
>> normDens-0.00392740.0012153-3.2320.00123 **
>>
>> normRain0.00027830.00129940.2140.83042
>>
>> ---
>>
>> Signif. codes:0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> Correlation of Fixed Effects:
>>
>> (Intr) nrmDns
>>
>> normDens -0.275
>>
>> normRain0.019 -0.019
>>
>> convergence code: 0
>>
>> boundary (singular) fit: see ?isSingular
>>
>>> M65 <- glmer(IBI~normDens+normRain + (1|MatID), family=Gamma, nAGQ=5,data=Sim)
>>
>>> summary(M65)
>>
>> Generalized linear mixed model fit by maximum likelihood (Adaptive Gauss-Hermite Quadrature, nAGQ = 5) ['glmerMod']
>>
>> Family: Gamma( inverse )
>>
>> Formula: IBI ~ normDens + normRain + (1 | MatID)
>>
>> Data: Sim
>>
>> AICBIClogLik deviance df.resid
>>
>> InfInf-InfInf495
>>
>> Scaled residuals:
>>
>> Min1QMedian3QMax
>>
>> -2.2488 -0.39020.03160.34851.7127
>>
>> Random effects:
>>
>> GroupsNameVarianceStd.Dev.
>>
>> MatID(Intercept) 6.007e-06 0.002451
>>
>> Residual8.831e-04 0.029716
>>
>> Number of obs: 500, groups:MatID, 99
>>
>> Fixed effects:
>>
>> Estimate Std. Error t value Pr(>|z|)
>>
>> (Intercept)2.936e-022.495e-04117.68<2e-16 ***
>>
>> normDens-2.182e-031.163e-04-18.76<2e-16 ***
>>
>> normRain4.883e-043.973e-0512.29<2e-16 ***
>>
>> ---
>>
>> Signif. codes:0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> Correlation of Fixed Effects:
>>
>> (Intr) nrmDns
>>
>> normDens -0.013
>>
>> normRain0.004 -0.041
>>
>> convergence code: 0
>>
>> Gradient contains NAs
>>
>> Warning messages:
>>
>> 1: In vcov.merMod(object, use.hessian = use.hessian) :
>>
>> variance-covariance matrix computed from finite-difference Hessian is
>>
>> not positive definite or contains NA values: falling back to var-cov estimated from RX
>>
>> 2: In vcov.merMod(object, correlation = correlation, sigm = sig) :
>>
>> variance-covariance matrix computed from finite-difference Hessian is
>>
>> not positive definite or contains NA values: falling back to var-cov estimated from RX
>>
>> For comparison, the linear model seems to be well behaved:
>>
>> M1 <- lmer(IBI~normDens+normRain +(1|MatID), REML=FALSE, data=Sim)
>>
>>>
>>
>> summary(M1)
>>
>> Linear mixed model fit by maximum likelihood
>>
>> ['lmerMod']
>>
>> Formula: IBI ~ normDens + normRain + (1 | MatID)
>>
>> Data: Sim
>>
>> AIC
>>
>> BIC
>>
>> logLik deviance df.resid
>>
>> 1804.2
>>
>> 1825.2
>>
>> -897.1
>>
>> 1794.2
>>
>> 495
>>
>> Scaled residuals:
>>
>> Min
>>
>> 1Q
>>
>> Median
>>
>> 3Q
>>
>> Max
>>
>> -3.2970 -0.5304
>>
>> 0.0153
>>
>> 0.5484
>>
>> 3.4798
>>
>> Random effects:
>>
>> Groups
>>
>> Name
>>
>> Variance Std.Dev.
>>
>> MatID
>>
>> (Intercept) 43.5694
>>
>> 6.6007
>>
>> Residual
>>
>> 0.6737
>>
>> 0.8208
>>
>> Number of obs: 500, groups:
>>
>> MatID, 99
>>
>> Fixed effects:
>>
>> Estimate Std. Error t value
>>
>> (Intercept) 35.47121
>>
>> 0.66443
>>
>> 53.39
>>
>> normDens
>>
>> 2.00732
>>
>> 0.11816
>>
>> 16.99
>>
>> normRain
>>
>> -0.71747
>>
>> 0.04237
>>
>> -16.93
>>
>> Correlation of Fixed Effects:
>>
>> (Intr) nrmDns
>>
>> normDens -0.002
>>
>> normRain
>>
>> 0.000 -0.044
>>
>> Should I just conclude that the data is not well modelled by a Gamma GLMM?
>>
>> Peter R Law
>> Research Associate
>> Center for African Conservation Ecology
>> Nelson Mandela University
>> South Africa
>>
>> Sent with [ProtonMail](https://protonmail.com/) Secure Email.
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>