[R-sig-ME] Convergence Problems with glmer.nb model

Thierry Onkelinx thierry.onkelinx at inbo.be
Mon May 2 13:05:25 CEST 2016


Dear Aoibheann,

It looks like the problem is within glmer.nb. I've fit the same model with
glmer.nb, glmmadmb and inla. glmmadmb and inla give comparable estimates.
The standard errors of glmer.nb are problematic.

Maybe Ben Bolker can diagnose what is going wrong with glmer.nb.

library(readr)
library(dplyr)
dataset <- read_csv("foraging subset.csv") %>%
  mutate(
    name = factor(name),
    fieldid = factor(rank(origarea)),
    logarea = log(origarea)
  )

library(lme4)
m.lme4 <- glmer.nb(field_count ~ logarea + habitat + (1 | name) + (1 |
fieldid), data = dataset)

library(glmmADMB)
m.admb <- glmmadmb(field_count ~ logarea + habitat + (1 | name) + (1 |
fieldid), data = dataset, family = "nbinom")


# install.packages("INLA", repos="https://www.math.ntnu.no/inla/R/stable")
library(INLA)
m.inla <- inla(field_count ~ f(name, model = "iid") + f(fieldid, model =
"iid") + logarea + habitat, data = dataset, family = "nbinomial")

summary(m.lme4)$coefficients
summary(m.admb)$coefficients
m.inla$summary.fixed

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-04-29 16:55 GMT+02:00 Aoibheann Gaughran <gaughra op tcd.ie>:

> Hi Thierry,
>
> Apologies in the delay in reverting, I have been out on fieldwork. I have
> tried your suggestions, but unfortunately am still having convergence
> issues.  The most simple Poisson model runs fine, so I tried a negative
> binomial distribution but that produced about 12 convergence warnings.
>
> > modelA <- glmer.nb(field_count ~ (1|animal) + (1|field_id) + offset(log(origarea)), family = poisson, data = dframe2)
>
>  There were 12 warnings (use warnings() to see them)
>
>  > warnings()
>
> Warning messages:
>
> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  ... :
>   Model failed to converge with max|grad| = 0.0493534 (tol = 0.001, component 1)
>
> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  ... :
>   Model failed to converge with max|grad| = 0.0627833 (tol = 0.001, component 1)
>
> 3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  ... :
>   Model is nearly unidentifiable: very large eigenvalue - Rescale variables?
>
> etc.
>
>
> I reverted to the simple Poisson and added the OLR, which ran, but as soon
> as I add any additional complexity/variables (apart from sex) it produces a
> convergence warning.
>
> Warning message:In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 0.0332321 (tol = 0.001, component 1)
>
>
> I am not sure where to go from here. Any guidance would be much
> appreciated.
>
> Kind regards,
>
> Aoibheann
>
>
>
> On 25 April 2016 at 14:44, Aoibheann Gaughran <gaughra op tcd.ie> wrote:
>
>> Hi Thierry,
>>
>> Here is the dropbox link to the data -
>> https://www.dropbox.com/s/ne5d4zp2gncwylm/foraging%20subset.csv?dl=0
>>
>> I had changed the field area units from meter square to hectures already,
>> so that *should *be okay.  There is one exceedingly large "field" which
>> is actually a large area of forestry. Perhaps this is throwing the scaling
>> off.
>>
>> Can you confirm that this is how I specify the observation level random
>> effect:
>>
>> dframe1$obs <- factor(seq(nrow(dframe1)))  #modified from https://rpubs.com/bbolker/glmmchapter
>>
>> which is included in the model as
>>
>> +(1|obs)
>>
>>
>> I'll try your suggestions and see how I get on.
>>
>> Many thanks,
>>
>> Aoibheann
>>
>> On 25 April 2016 at 13:40, Thierry Onkelinx <thierry.onkelinx op inbo.be>
>> wrote:
>>
>>> Dear Aoibheann,
>>>
>>> Two general suggestions on the design. 1) A random effect of field seems
>>> relevant too. 2) Have the units of origarea in a relevant scale. You are
>>> modelling the number of visits per unit of origarea. Then the number per
>>> hectare seems more relevant to me than the number per square meter.
>>>
>>> Then try the most simple Poisson model to see it that converge. glmer(field_count
>>> ~ (1| animal) + (1|field) + offset(log(origarea)), family = poisson)
>>>
>>> If that works, then you could try the negative binomial distribution or
>>> adding an observation level random effect.
>>>
>>> The mailing list strips most attachments. So you need to put them on a
>>> website or post a dropbox or google drive link.
>>>
>>> Best regards,
>>>
>>>
>>> ir. Thierry Onkelinx
>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>>> and Forest
>>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>>> Kliniekstraat 25
>>> 1070 Anderlecht
>>> Belgium
>>>
>>> To call in the statistician after the experiment is done may be no more
>>> than asking him to perform a post-mortem examination: he may be able to say
>>> what the experiment died of. ~ Sir Ronald Aylmer Fisher
>>> The plural of anecdote is not data. ~ Roger Brinner
>>> The combination of some data and an aching desire for an answer does not
>>> ensure that a reasonable answer can be extracted from a given body of data.
>>> ~ John Tukey
>>>
>>> 2016-04-25 14:08 GMT+02:00 Aoibheann Gaughran <gaughra op tcd.ie>:
>>>
>>>> On 25 April 2016 at 13:00, Aoibheann Gaughran <gaughra op tcd.ie> wrote:
>>>>
>>>> > Good morning,
>>>> >
>>>> > First time posting so I hope I am including all of the relevant
>>>> > information.
>>>> >
>>>> > I am attempting to analyse the foraging behaviour of a animal in an
>>>> > agricultural landscape. The objective is to identify the factors
>>>> (habitat
>>>> > type, environmental variables and animal-specific variables) that best
>>>> > predict foraging site preference. Some fields are preferred while
>>>> others
>>>> > are avoided.
>>>> >
>>>> > The response variable is count data - the number of times a given
>>>> animal
>>>> > was in a given field in a given month. An animal's home range varies
>>>> from
>>>> > month to month, so the area available to it and the fields that fall
>>>> within
>>>> > its home range change somewhat every month. The count data shows an
>>>> > overdispersed, negative binomial distribution, and is zero inflated as
>>>> > fields that fell within the home range where the animal had *not
>>>> *foraged
>>>> > in that month are also included in the dataset. The individual animal
>>>> is
>>>> > specified as a random variable to account for pseudoreplication.
>>>> >
>>>> > It should be noted that at the moment I am attempting to run a the
>>>> model
>>>> > on a subset of the data (n=671) as I had attempted to run the model
>>>> on the
>>>> > full dataset (n=62,000) but three days later the model (which included
>>>> > interaction terms at this point) had still failed to run, and when
>>>> stopped,
>>>> > R gave me a multitude of convergence warning messages e.g.
>>>> >
>>>> > 13: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int
>>>> (Inf,
>>>> > ... :
>>>> >   failure to converge in 10000 evaluations
>>>> >
>>>> > Simpler iterations of the model, with fewer explanatory terms, and no
>>>> > interaction terms, also gave me convergence and some scaling warnings,
>>>> > which I sought to address using:
>>>> >
>>>> > control=glmerControl(optCtrl=list(maxfun=20000)
>>>> >
>>>> > and by scaling the numeric variables age, slope and aspect as
>>>> follows:-
>>>> >
>>>> > dframe1$agescale <- scale(dframe1$age, center = TRUE, scale = FALSE)
>>>> > dframe1$slopescale <- scale(dframe1$slope, center = TRUE, scale =
>>>> FALSE)
>>>> > dframe1$aspectscale <- scale(dframe1$aspect, center = TRUE, scale =
>>>> FALSE)
>>>> >
>>>> > Currently, the model looks like this:
>>>> >
>>>> > > model1 <- glmer.nb(field_count ~ habitat +                    +
>>>> sex+                    + agescale+                    #+ mon+
>>>>       + soil+                    + slopescale+                    +
>>>> aspectscale+                    + offset(log(origarea)) #take into account
>>>> field size +                    +(1|animal),+
>>>> control=glmerControl(optCtrl=list(maxfun=20000)),+
>>>> data = dframe1)
>>>> >
>>>> > There were 24 warnings (use warnings() to see them)
>>>> > > warnings()Warning messages:
>>>> > 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl =
>>>> control$checkConv,  ... :
>>>> >   Model is nearly unidentifiable: very large eigenvalue
>>>> >  - Rescale variables?;Model is nearly unidentifiable: large
>>>> eigenvalue ratio
>>>> >  - Rescale variables?
>>>> > 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl =
>>>> control$checkConv,  ... :
>>>> >   Model failed to converge with max|grad| = 0.0134799 (tol = 0.001,
>>>> component 1)
>>>> > 3: In checkConv(attr(opt, "derivs"), opt$par, ctrl =
>>>> control$checkConv,  ... :
>>>> >   Model failed to converge with max|grad| = 0.148644 (tol = 0.001,
>>>> component 1)
>>>> > 4: In checkConv(attr(opt, "derivs"), opt$par, ctrl =
>>>> control$checkConv,  ... :
>>>> >   Model is nearly unidentifiable: large eigenvalue ratio
>>>> >  - Rescale variables?
>>>> >
>>>> > etc.
>>>> >
>>>> > So the model still fails to converge despite rescaling and altering
>>>> the
>>>> > number of iterations. I had also received the following error in
>>>> relation
>>>> > to month (in the reduced dataset there are only *four *months), so Ive
>>>>
>>>> > had to exclude it for the time being. I am not sure why I am getting
>>>> this
>>>> > error since the factor has four levels.
>>>> >
>>>> > Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>>>> > contrasts can be applied only to factors with 2 or more levels
>>>> >
>>>> > I do eventually want to include interaction terms as previous
>>>> analysis on
>>>> > ranging behaviour suggests there is an interaction between age and
>>>> sex.
>>>> >
>>>> > Summary of dataset attached.  Also attached is the .csv file
>>>> containing
>>>> > the reduced dataset.
>>>> >
>>>> > I have read various suggestions online and have come across the
>>>> following
>>>> > worrying line "It's perfectly possible that your data is insufficient
>>>> to
>>>> > support the complexity of the model or the model is incorrectly
>>>> constructed
>>>> > for the design of the study".
>>>> >
>>>> > I would greatly appreciate any help you could give me with
>>>> understanding
>>>> > and solving the problems I am encountering with my model.
>>>> >
>>>> > Kind regards,
>>>> >
>>>> > --
>>>> > Aoibheann Gaughran
>>>> >
>>>> > Behavioural and Evolutionary Ecology Research Group
>>>> > Zoology Building
>>>> > School of Natural Sciences
>>>> > Trinity College Dublin
>>>> > Dublin 2
>>>> > Ireland
>>>> > Phone: +353 (86) 3812615
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> Aoibheann Gaughran
>>>>
>>>> Behavioural and Evolutionary Ecology Research Group
>>>> Zoology Building
>>>> School of Natural Sciences
>>>> Trinity College Dublin
>>>> Dublin 2
>>>> Ireland
>>>> Phone: +353 (86) 3812615
>>>> _______________________________________________
>>>> R-sig-mixed-models op r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>
>>>
>>
>>
>> --
>> Aoibheann Gaughran
>>
>> Behavioural and Evolutionary Ecology Research Group
>> Zoology Building
>> School of Natural Sciences
>> Trinity College Dublin
>> Dublin 2
>> Ireland
>> Phone: +353 (86) 3812615
>>
>
>
>
> --
> Aoibheann Gaughran
>
> Behavioural and Evolutionary Ecology Research Group
> Zoology Building
> School of Natural Sciences
> Trinity College Dublin
> Dublin 2
> Ireland
> Phone: +353 (86) 3812615
>

	[[alternative HTML version deleted]]



More information about the R-sig-mixed-models mailing list