# [R-sig-ME] glmer.nb warning and prediction problem

Ben Bolker bbolker at gmail.com
Fri Mar 4 21:02:00 CET 2022

```  (1) tl;dr don't worry about it. It's very, very likely that you have
data whose conditional distribution is indistinguishable from a
Poisson. I know that the *marginal* variance/mean ratio is large, but
once we put in the covariates and the random effects, what's left is
presumably equi- or underdispersed (variance ~ mean or variance <
mean). The mean counts at the beginning of the time series are about 8
(exp(2)), decreasing over time; your dispersion parameter is 798, >>
mean. Since the negative binomial variance is mean + mean^2/theta, the
extra term is negligible and the distribution is effectively Poisson.
The iteration limit warning comes from the initial call to MASS::glm.nb,
which tries to determine an initial guess for the dispersion parameter
via an iterative algorithm and gives up after a while.
You could try a Poisson model - my guess is that it will give
practically indistinguishable results (and a likelihood ratio test
probably won't reject the null hypothesis that the conditional
distribution is Poisson).
Or you could ignore the warning.
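
# A minimal sketch of the arithmetic behind "effectively Poisson", assuming
# the usual negative binomial parameterization var = mu + mu^2/theta.
# The two numbers are copied from the quoted summary below; the variable
# names are illustrative only.
mu     <- exp(2.09539)        # fitted mean count at Years = 0 (the intercept)
theta  <- 798.6507            # dispersion parameter reported by glmer.nb
nb_var <- mu + mu^2 / theta   # negative binomial variance at that mean
ratio  <- nb_var / mu         # conditional variance/mean ratio; Poisson gives 1
print(round(ratio, 3))
# The ratio is within about 1% of 1, so the conditional distribution is
# hard to tell apart from a Poisson, whatever the marginal ratio is.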

(2) You might need to put the levels of `Dag` and `Dec_Hour` in
quotation marks in `newd`.  What does `lapply(model.frame(mod5), unique)` return?
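
# A hedged sketch of the prediction problem on SIMULATED data (`sim` is a
# hypothetical stand-in for Trend1, and a Poisson fit stands in for mod5).
# It assumes the error arises because the Dag = 34 / Dec_Hour = 7.0
# combination is not a level seen when fitting; that is not confirmed here.
library(lme4)

set.seed(101)
n_per_day <- 6
sim <- data.frame(
  Dag      = rep(31:80, each = n_per_day),
  Years    = sample(1:29, 50 * n_per_day, replace = TRUE),
  Dec_Hour = runif(50 * n_per_day, min = 4, max = 21)  # continuous, never exactly 7
)
day_eff   <- rnorm(50, sd = 0.3)[sim$Dag - 30]   # one random effect per day
trip_eff  <- rnorm(nrow(sim), sd = 0.4)          # one per Dec_Hour within day
sim$Total <- rpois(nrow(sim), exp(2 - 0.04 * sim$Years + day_eff + trip_eff))

fit <- glmer(Total ~ Years + (1 | Dag/Dec_Hour), data = sim, family = poisson)

# Which grouping values does the fitted model actually know about?
lv <- lapply(model.frame(fit)[c("Dag", "Dec_Hour")], unique)

newd <- data.frame(Years = c(1, 29), Dag = 34, Dec_Hour = 7.0)

# The default conditions on all random effects, so an unseen Dec_Hour:Dag
# level fails; the message is captured here rather than stopping the script.
err <- tryCatch(predict(fit, newd), error = function(e) conditionMessage(e))

# Three ways around it, using arguments of lme4's predict method:
p_pop <- predict(fit, newd, re.form = NA, type = "response")          # fixed effects only
p_dag <- predict(fit, newd, re.form = ~(1 | Dag), type = "response")  # keep the Dag effect
p_new <- predict(fit, newd, allow.new.levels = TRUE, type = "response")  # unseen levels set to 0
# Which one is appropriate depends on whether the prediction should be for
# a typical day and hour or for day 34 specifically.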

On Fri, Mar 4, 2022 at 7:02 AM Adriaan de Jong <Adriaan.de.Jong at slu.se> wrote:
>
> Dear list members,
>
> What I’m trying to do is to model a linear trend over the years (Years = Year – 1992, 1993 was the starting year of the 29 year data series) of bird numbers (“Total”) with a var/mean ratio of 2.85. The variables “Dag” (= day of the study period starting on April 1 = 1, in this subset ranging from 31 to 80) and “Dec_Hour” (= decimal hour between 4 AM and 9 PM) are nested random effects. The subset for this analysis (Trend1) contains 993 rows (=counting trips).
>
> The mod5<-glmer.nb(Total~  Years + (1|Dag/Dec_Hour),data=Trend1) command under lme4 gives me the following warning message:
>
> In theta.ml(Y, mu, weights = object@resp$weights, limit = limit,  :
>   iteration limit reached
>
> I presume I need to provide glmer.nb with a lmerControl string, but have no clue what it should contain.
>
> I guess that the Dec_Hour may be causing the iteration limit problem. The model seems to converge, though, and the summary looks like this
>
>
> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [
> glmerMod]
>  Family: Negative Binomial(798.6507)  ( log )
> Formula: Total ~ Years + (1 | Dag/Dec_Hour)
>    Data: Trend1
>
>      AIC      BIC   logLik deviance df.resid
>   4921.7   4946.2  -2455.8   4911.7      988
>
> Scaled residuals:
>     Min      1Q  Median      3Q     Max
> -1.9784 -0.6199 -0.0885  0.4511  3.8851
>
> Random effects:
>  Groups       Name        Variance Std.Dev.
>  Dec_Hour:Dag (Intercept) 0.15182  0.3896
>  Dag          (Intercept) 0.09292  0.3048
> Number of obs: 993, groups:  Dec_Hour:Dag, 945; Dag, 50
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  2.09539    0.06025   34.78   <2e-16 ***
> Years       -0.04078    0.00248  -16.44   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Correlation of Fixed Effects:
>       (Intr)
> Years -0.594
>
> This makes sense to me, but I would like to predict the 1993 and 2021 endpoint "Total" values of the yearly trend (numerically Years = 1 and 29) given the peak values of observed birds found to be Dag=34 and Dec_Hour=7. For this I tried:
>
> newd<-data.frame(Years=c(1,29),Dag=34,Dec_Hour=7.0)
> pred<-predict(mod5,newd)
>
> This rendered the message:
> Error in levelfun(r, n, allow.new.levels = allow.new.levels) :
>   new levels detected in newdata
>
> Could anyone give me a hint on (a) how to avoid the iteration problem and (b) how to adjust the predict function? Thanks in advance for your help.
>
> Cheers,
>