[R-sig-ME] Convergence Problems with glmer.nb model

Mon Apr 25 18:04:02 CEST 2016

Dear Aoibheann,

The specification of the observation level random effect is correct.
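
As a minimal sketch of how that random effect would sit in the simple Poisson model, assuming the variable names from your code (dframe1, field_count, animal, origarea) plus a field identifier column called field: the toy data frame below is made up so that the snippet runs without your CSV, and the fit is skipped if lme4 is not installed.

```r
# Toy stand-in for dframe1 (hypothetical values, not the real data).
set.seed(1)
dframe1 <- data.frame(
  animal   = factor(rep(1:5, each = 8)),
  field    = factor(rep(1:8, times = 5)),
  origarea = rep(c(0.5, 1, 2, 4, 0.8, 1.5, 3, 6), times = 5)  # hectares
)
dframe1$field_count <- rpois(nrow(dframe1), lambda = 2 * dframe1$origarea)

# Observation-level random effect: one factor level per row.
dframe1$obs <- factor(seq_len(nrow(dframe1)))

# Poisson GLMM with animal, field and observation-level random effects
# and log(area) as offset, so the model describes visits per hectare.
if (requireNamespace("lme4", quietly = TRUE)) {
  m_olre <- lme4::glmer(
    field_count ~ (1 | animal) + (1 | field) + (1 | obs) +
      offset(log(origarea)),
    family = poisson, data = dframe1
  )
  print(summary(m_olre))
}
```

On toy Poisson data like this there is no overdispersion to absorb, so the fit may well be reported as singular; that would be a property of the made-up data, not of the specification.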

Looking at your data, I suspect that the large range in field area is the
culprit. The fields range from only 0.07 ha up to 37.12 ha. Note that one
visit in the smallest field equals a density of 14.5 visits/ha, while one
visit in the largest equals only 0.027 visits/ha.
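
The arithmetic can be checked with a few lines of base R. This is only a sketch using the two rounded areas quoted here; with 0.07 ha the smallest field comes out at about 14.3 visits/ha, so the 14.5 figure presumably reflects the unrounded area in the data.

```r
# Rounded areas (ha) of the smallest and largest field, as quoted above.
area_ha <- c(smallest = 0.07, largest = 37.12)

# Density implied by a single visit in each field (visits per hectare).
density_one_visit <- 1 / area_ha
print(round(density_one_visit, 3))

# The offset enters the linear predictor as log(area); this is the span
# it has to cover between the two extremes, on the log scale.
log_offset <- log(area_ha)
print(round(diff(log_offset), 2))
```

A difference of more than 6 on the log scale corresponds to a ratio of over 500 between the two areas, which the rest of the linear predictor has to work against.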

I recommend splitting the large field into smaller chunks. Smaller chunks
would also make variables like slope and aspect more meaningful. I doubt
that they are homogeneous over large areas.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-04-25 15:44 GMT+02:00 Aoibheann Gaughran <gaughra op tcd.ie>:

> Hi Thierry,
>
> Here is the dropbox link to the data -
> https://www.dropbox.com/s/ne5d4zp2gncwylm/foraging%20subset.csv?dl=0
>
> I had changed the field area units from square metres to hectares already,
> so that *should* be okay.  There is one exceedingly large "field" which
> is actually a large area of forestry. Perhaps this is throwing the scaling
> off.
>
> Can you confirm that this is how I specify the observation level random
> effect:
>
> dframe1$obs <- factor(seq(nrow(dframe1)))  #modified from https://rpubs.com/bbolker/glmmchapter
>
> which is included in the model as
>
> +(1|obs)
>
>
> I'll try your suggestions and see how I get on.
>
> Many thanks,
>
> Aoibheann
>
> On 25 April 2016 at 13:40, Thierry Onkelinx <thierry.onkelinx op inbo.be>
> wrote:
>
>> Dear Aoibheann,
>>
>> Two general suggestions on the design. 1) A random effect of field seems
>> relevant too. 2) Have the units of origarea in a relevant scale. You are
>> modelling the number of visits per unit of origarea. Then the number per
>> hectare seems more relevant to me than the number per square meter.
>>
>> Then try the simplest Poisson model to see if that converges:
>>
>> glmer(field_count ~ (1 | animal) + (1 | field) + offset(log(origarea)),
>>       family = poisson, data = dframe1)
>>
>> If that works, then you could try the negative binomial distribution or
>> adding an observation level random effect.
>>
>> The mailing list strips most attachments. So you need to put them on a
>> website or post a dropbox or google drive link.
>>
>> Best regards,
>>
>>
>>
>> 2016-04-25 14:08 GMT+02:00 Aoibheann Gaughran <gaughra op tcd.ie>:
>>
>>> On 25 April 2016 at 13:00, Aoibheann Gaughran <gaughra op tcd.ie> wrote:
>>>
>>> > Good morning,
>>> >
>>> > First time posting so I hope I am including all of the relevant
>>> > information.
>>> >
>>> > I am attempting to analyse the foraging behaviour of an animal in an
>>> > agricultural landscape. The objective is to identify the factors
>>> > (habitat type, environmental variables and animal-specific variables)
>>> > that best predict foraging site preference. Some fields are preferred
>>> > while others are avoided.
>>> >
>>> > The response variable is count data - the number of times a given
>>> > animal was in a given field in a given month. An animal's home range
>>> > varies from month to month, so the area available to it and the fields
>>> > that fall within its home range change somewhat every month. The count
>>> > data shows an overdispersed, negative binomial distribution, and is
>>> > zero inflated, as fields that fell within the home range where the
>>> > animal had *not* foraged in that month are also included in the
>>> > dataset. The individual animal is specified as a random effect to
>>> > account for pseudoreplication.
>>> >
>>> > It should be noted that at the moment I am attempting to run the model
>>> > on a subset of the data (n=671), as I had attempted to run the model
>>> > on the full dataset (n=62,000) but three days later the model (which
>>> > included interaction terms at this point) had still failed to run, and
>>> > when stopped, R gave me a multitude of convergence warning messages
>>> > e.g.
>>> >
>>> > 13: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf, ... :
>>> >   failure to converge in 10000 evaluations
>>> >
>>> > Simpler iterations of the model, with fewer explanatory terms, and no
>>> > interaction terms, also gave me convergence and some scaling warnings,
>>> > which I sought to address using:
>>> >
>>> > control=glmerControl(optCtrl=list(maxfun=20000))
>>> >
>>> > and by scaling the numeric variables age, slope and aspect as follows:-
>>> >
>>> > dframe1$agescale <- scale(dframe1$age, center = TRUE, scale = FALSE)
>>> > dframe1$slopescale <- scale(dframe1$slope, center = TRUE, scale = FALSE)
>>> > dframe1$aspectscale <- scale(dframe1$aspect, center = TRUE, scale = FALSE)
>>> >
>>> > Currently, the model looks like this:
>>> >
>>> > model1 <- glmer.nb(field_count ~ habitat +
>>> >                      sex +
>>> >                      agescale +
>>> >                      # mon +
>>> >                      soil +
>>> >                      slopescale +
>>> >                      aspectscale +
>>> >                      offset(log(origarea)) +  # take into account field size
>>> >                      (1 | animal),
>>> >                    control = glmerControl(optCtrl = list(maxfun = 20000)),
>>> >                    data = dframe1)
>>> >
>>> > There were 24 warnings (use warnings() to see them)
>>> > > warnings()
>>> > Warning messages:
>>> > 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  ... :
>>> >   Model is nearly unidentifiable: very large eigenvalue
>>> >  - Rescale variables?;Model is nearly unidentifiable: large eigenvalue ratio
>>> >  - Rescale variables?
>>> > 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  ... :
>>> >   Model failed to converge with max|grad| = 0.0134799 (tol = 0.001, component 1)
>>> > 3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  ... :
>>> >   Model failed to converge with max|grad| = 0.148644 (tol = 0.001, component 1)
>>> > 4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  ... :
>>> >   Model is nearly unidentifiable: large eigenvalue ratio
>>> >  - Rescale variables?
>>> >
>>> > etc.
>>> >
>>> > So the model still fails to converge despite rescaling and altering the
>>> > number of iterations. I had also received the following error in
>>> > relation to month (in the reduced dataset there are only *four*
>>> > months), so I've had to exclude it for the time being. I am not sure
>>> > why I am getting this error since the factor has four levels.
>>> >
>>> > Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>>> > contrasts can be applied only to factors with 2 or more levels
>>> >
>>> > I do eventually want to include interaction terms as previous analysis
>>> > on ranging behaviour suggests there is an interaction between age and
>>> > sex.
>>> >
>>> > Summary of dataset attached.  Also attached is the .csv file containing
>>> > the reduced dataset.
>>> >
>>> > I have read various suggestions online and have come across the
>>> > following worrying line: "It's perfectly possible that your data is
>>> > insufficient to support the complexity of the model or the model is
>>> > incorrectly constructed for the design of the study".
>>> >
>>> > I would greatly appreciate any help you could give me with
>>> > understanding and solving the problems I am encountering with my model.
>>> >
>>> > Kind regards,
>>> >
>>> > --
>>> > Aoibheann Gaughran
>>> >
>>> > Behavioural and Evolutionary Ecology Research Group
>>> > Zoology Building
>>> > School of Natural Sciences
>>> > Trinity College Dublin
>>> > Dublin 2
>>> > Ireland
>>> > Phone: +353 (86) 3812615
>>> >
>>>
>>>
>>>
>>>
>>
>>
>
>
>
