[R-sig-ME] glmer.nb warning and prediction problem

Fri Mar 4 21:02:27 CET 2022

  (It wouldn't hurt for us to make the error message slightly more
informative by listing *which* new levels were found ...)
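Until then, a user-side check along these lines can show which values are unseen. This is only a base-R sketch: the helper name `find_new_levels` and the toy vectors are made up for illustration, and none of it is lme4's internal code.

```r
# Hypothetical helper: values in `new` that were not among the
# levels present when the model was fitted.
find_new_levels <- function(known, new) {
  setdiff(as.character(unique(new)), as.character(unique(known)))
}

# Toy vectors (assumed, not taken from the thread's data)
fitted_dag  <- c(31, 32, 34, 35)   # levels seen at fitting time
newdata_dag <- c(34, 36)           # levels requested for prediction

new_found <- find_new_levels(fitted_dag, newdata_dag)
print(new_found)
```

For a nested term such as Dag/Dec_Hour, the same comparison would have to be made for the Dec_Hour:Dag combinations as well as for Dag itself.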

On Fri, Mar 4, 2022 at 3:02 PM Ben Bolker <bbolker using gmail.com> wrote:
>
>   (1) tl;dr don't worry about it. It's very, very likely that you have
> data whose conditional distribution is indistinguishable from a
> Poisson. I know that the *marginal* variance/mean ratio is large, but
> once we put in the covariates and the random effects, what's left is
> presumably equi- or underdispersed (variance ~ mean or variance <
> mean). The mean counts at the beginning of the time series are about 8
> (exp(2.1)), decreasing over time; your estimated dispersion parameter
> (theta) is about 798, far larger than the mean. Since the negative
> binomial variance is mu + mu^2/theta, that means the distribution is
> effectively Poisson. The iteration limit warning comes from the
> initial call to MASS::glm.nb, which tries to determine an initial
> guess for the dispersion parameter via an iterative algorithm - it
> gives up after a while.
>    You could try a Poisson model - my guess is that it will give
> practically indistinguishable results (and a likelihood ratio test
> probably won't reject the null hypothesis that the conditional
> distribution is Poisson).
>    Or you could ignore the warning.
>
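To put numbers on point (1), here is a small sketch using the values quoted in the thread (mean of about 8, theta = 798.6507). The first part is plain base R. The second part only runs if lme4 and the poster's objects `Trend1` and `mod5` are present; the name `mod5_pois` is an assumption introduced here.

```r
mu    <- 8          # approximate mean count early in the series
theta <- 798.6507   # dispersion estimate from the model summary

# Negative binomial variance (mu + mu^2/theta) relative to Poisson (mu)
var_nb    <- mu + mu^2 / theta
var_ratio <- var_nb / mu          # about 1.01

# Largest absolute difference between the two probability mass functions
k <- 0:50
max_pmf_diff <- max(abs(dnbinom(k, size = theta, mu = mu) -
                        dpois(k, lambda = mu)))

# Optional: refit as Poisson and compare log-likelihoods.
# Skipped unless lme4 and the thread's objects are available.
if (requireNamespace("lme4", quietly = TRUE) &&
    exists("Trend1") && exists("mod5")) {
  mod5_pois <- lme4::glmer(Total ~ Years + (1 | Dag/Dec_Hour),
                           family = poisson, data = Trend1)
  lrt <- 2 * (as.numeric(logLik(mod5)) - as.numeric(logLik(mod5_pois)))
  # Naive chi-square(1) p-value; the Poisson limit sits on the boundary
  # of the parameter space, so treat this as conservative.
  p_naive <- pchisq(lrt, df = 1, lower.tail = FALSE)
  print(c(LRT = lrt, p = p_naive))
}
```

A variance ratio this close to 1 is why the negative binomial fit and a Poisson fit should be hard to tell apart here.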
> (2) You might need to put the levels of `Dag` and `Dec_Hour` in
> quotation marks. What does `lapply(model.frame(mod5), unique)` return?
>
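For the prediction itself, a hedged sketch of some options. The first part uses only the fixed-effect estimates printed in the summary. The second part uses the `re.form`, `allow.new.levels` and `type` arguments of lme4's predict method, and only runs if lme4 and the poster's `mod5` exist. Whether Dag = 34 or the combination with Dec_Hour = 7 is actually present in the data is not known from the thread, so the middle call may still fail.

```r
# Population-level trend endpoints from the printed fixed effects
b0 <- 2.09539     # (Intercept)
b1 <- -0.04078    # Years
years <- c(1, 29)
pred_pop <- exp(b0 + b1 * years)   # expected counts, roughly 7.8 and 2.5

# Skipped unless lme4 and the thread's fitted model are available
if (requireNamespace("lme4", quietly = TRUE) && exists("mod5")) {
  newd <- data.frame(Years = years, Dag = 34, Dec_Hour = 7.0)

  # Ignore all random effects (should agree with pred_pop up to rounding)
  p_fixed <- predict(mod5, newd, re.form = NA, type = "response")

  # Condition on the Dag effect only (assumes Dag = 34 occurs in the data)
  p_dag <- predict(mod5, newd, re.form = ~ (1 | Dag), type = "response")

  # Keep the full random-effects structure; unseen levels fall back to
  # the population-level value instead of raising an error
  p_new <- predict(mod5, newd, allow.new.levels = TRUE, type = "response")
}
```

Note that the population-level values describe a typical day and hour, not the peak at Dag = 34; conditioning on Dag is what would bring the seasonal peak into the prediction.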
>
> On Fri, Mar 4, 2022 at 7:02 AM Adriaan de Jong <Adriaan.de.Jong using slu.se> wrote:
> >
> > Dear list members,
> >
> > What I’m trying to do is to model a linear trend over the years (Years = Year – 1992, 1993 was the starting year of the 29 year data series) of bird numbers (“Total”) with a var/mean ratio of 2.85. The variables “Dag” (= day of the study period starting on April 1 = 1, in this subset ranging from 31 to 80) and “Dec_Hour” (= decimal hour between 4 AM and 9 PM) are nested random effects. The subset for this analysis (Trend1) contains 993 rows (=counting trips).
> >
> > The mod5<-glmer.nb(Total~  Years + (1|Dag/Dec_Hour),data=Trend1) command under lme4 gives me the following warning message:
> >
> > In theta.ml(Y, mu, weights = object@resp$weights, limit = limit,  :
> >   iteration limit reached
> >
> > I presume I need to provide glmer.nb with a lmerControl string, but have no clue what it should contain.
> >
> > I guess that the Dec_Hour may be causing the iteration limit problem. The model seems to converge, though, and the summary looks like this
> >
> >
> > Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
> >  Family: Negative Binomial(798.6507)  ( log )
> > Formula: Total ~ Years + (1 | Dag/Dec_Hour)
> >    Data: Trend1
> >
> >      AIC      BIC   logLik deviance df.resid
> >   4921.7   4946.2  -2455.8   4911.7      988
> >
> > Scaled residuals:
> >     Min      1Q  Median      3Q     Max
> > -1.9784 -0.6199 -0.0885  0.4511  3.8851
> >
> > Random effects:
> >  Groups       Name        Variance Std.Dev.
> >  Dec_Hour:Dag (Intercept) 0.15182  0.3896
> >  Dag          (Intercept) 0.09292  0.3048
> > Number of obs: 993, groups:  Dec_Hour:Dag, 945; Dag, 50
> >
> > Fixed effects:
> >             Estimate Std. Error z value Pr(>|z|)
> > (Intercept)  2.09539    0.06025   34.78   <2e-16 ***
> > Years       -0.04078    0.00248  -16.44   <2e-16 ***
> > ---
> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Correlation of Fixed Effects:
> >       (Intr)
> > Years -0.594
> >
> > This makes sense to me, but I would like to predict the 1993 and 2021 endpoint "Total" values of the yearly trend (numerically Years = 1 and 29) given the peak values of observed birds found to be Dag=34 and Dec_Hour=7. For this I tried:
> >
> > newd<-data.frame(Years=c(1,29),Dag=34,Dec_Hour=7.0)
> > pred<-predict(mod5,newd)
> >
> > This rendered the message:
> > Error in levelfun(r, n, allow.new.levels = allow.new.levels) :
> >   new levels detected in newdata
> >
> > Could anyone give me a hint on (a) how to avoid the iteration problem and (b) how to adjust the predict function? Thanks in advance for your help.
> >
> > Cheers,
> > Adjan
> >
> > Adriaan "Adjan" de Jong
> > Associate professor
> > Swedish University of Agricultural Sciences
> > _______________________________________________
> > R-sig-mixed-models using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models