[R-sig-ME] Fixing singularity in a generalized linear mixed effect model

Tue Mar 26 15:37:02 CET 2019

Dear René and Thierry

Thank you very much for your answer.
I have to check about sharing the data.

In the meantime, I'll explain the experiment a bit more.
Lance means "set" and is the code given to a fishing set.
In the experiment, within each fishing trip there are multiple fishing sets
(i.e. the fishermen soak their nets multiple times, each time on a different
day, usually consecutive days) and each fishing set consists of a pair of
nets: one control net and one experimental net. This is why I believe I
must include "Lance" as a random effect, because it is really important for
the design to have a paired experiment.
So an example from my dataset would be

Trip.Code  Lance.N  Observer.Name  start date  Effort    CE  Turtles.TOT
EC062      159      Alexis Lopez   1/7/2015    0.443103  C   0
EC062      159      Alexis Lopez   1/7/2015    0.398793  E   0
EC062      160      Alexis Lopez   1/8/2015    0.474345  C   0
EC062      160      Alexis Lopez   1/8/2015    0.426911  E   0

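
To make the pairing concrete, and to start answering Thierry's question about combinations with only zeros, here is a minimal sketch in base R. It assumes only the four example rows above (the full data set has 292 rows); the object names `d`, `pair_tab`, `set_totals` and `n_zero_sets` are just for illustration.

```r
# Toy rows copied from the example above; the real data set has 292 rows.
d <- data.frame(
  Trip.Code   = c("EC062", "EC062", "EC062", "EC062"),
  Lance.N     = c(159, 159, 160, 160),
  Effort      = c(0.443103, 0.398793, 0.474345, 0.426911),
  CE          = c("C", "E", "C", "E"),
  Turtles.TOT = c(0, 0, 0, 0)
)

# Pairing check: every set should contain exactly one control (C)
# and one experimental (E) net.
pair_tab   <- table(d$Lance.N, d$CE)
all_paired <- all(pair_tab == 1)

# Total catch per set; sets with a total of zero are the
# "only zeros" combinations Thierry asked about.
set_totals  <- tapply(d$Turtles.TOT, d$Lance.N, sum)
n_zero_sets <- sum(set_totals == 0)

all_paired
n_zero_sets
```

Run on the full data, the same two checks would show whether any set is missing one of its nets and how many of the 146 sets caught nothing in either net.
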
What confuses me is that, even if I leave "Lance" as the only random
effect, the model is still singular.
Does this happen because the number of levels for Lance is too high
compared to the number of observations?
Would it be better to restart the Lance (set) number within each trip, i.e.
trip EC062 sets 1, 2, 3, etc., then trip EC063 sets 1, 2, 3, etc.?
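
To check my own understanding of René's two points (number of levels, and crossed versus nested factors), I put together the sketch below. The level counts are copied from the summary(m1) output quoted further down. The helper `is_nested_in()` is my own hypothetical function, not part of lme4, and the toy trips use made-up set numbers.

```r
# Number of levels per grouping factor, copied from the summary(m1) output.
n_levels <- c(Lance.N = 146, Boat.Name = 21, Observer.Name = 5,
              Season = 4, SetYear = 4)
n_obs <- 292

# René's rule of thumb: random intercepts only for factors with at least 6 levels.
few_levels <- names(n_levels)[n_levels < 6]

# Observations per set: 292 / 146 = 2 (one control and one experimental net).
obs_per_set <- n_obs / n_levels[["Lance.N"]]

# Hypothetical helper: TRUE when each level of `inner` occurs within
# exactly one level of `outer`, i.e. `inner` is nested in `outer`.
is_nested_in <- function(inner, outer) {
  all(rowSums(table(inner, outer) > 0) == 1)
}

# Toy version of the alternative coding: set numbers restart within each trip.
toy <- data.frame(
  Trip.Code   = rep(c("EC062", "EC063"), each = 4),
  Set.In.Trip = rep(c(1, 1, 2, 2), times = 2)
)

# Restarted numbers are not unique across trips, so rebuild a unique set id.
toy$Set.ID <- interaction(toy$Trip.Code, toy$Set.In.Trip, drop = TRUE)

few_levels
obs_per_set
is_nested_in(toy$Set.In.Trip, toy$Trip.Code)  # set 1 occurs in both trips
is_nested_in(toy$Set.ID, toy$Trip.Code)       # each id belongs to one trip
nlevels(toy$Set.ID)
```

If I read this correctly, restarting the set number within each trip would only work if the random effect is defined on the unique id, or written as (1 | Trip.Code:Set.In.Trip), and it would then define the same groups as the unique Lance.N codes I already have.
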

Thanks again for your help!

Alessandra

On Tue, Mar 26, 2019 at 9:08 AM René <bimonosom using gmail.com> wrote:

> Hi Alessandra,
>
> your model output says:
> "Number of obs: 292"
> and your model has 2 fixed effects and 5 (!) random effects.
> If all these random effects are fully crossed, then assuming you
> have 19 participants (e.g. 1|observers) and 4 random effects crossed on
> them with two levels each (2*2*2*2 = 16 cells), you would get about 292
> observations (19 * 16 = 304).
> Now you see this math uncovers the most likely problem: a random-effect
> (intercept) factor is strongly recommended to have at least (!) 6 levels to
> make such a model meaningful.
> If all these random effects are not fully crossed, then the model is
> misspecified, i.e. defining random intercepts for factor 1 separately from
> random intercepts for another factor 2, when factor 1 is nested in factor
> 2, is over-identifying the randomness in your model -> singular.
>
> So:
> - Only define random intercepts for factors with more than 6 levels (move
>   factors with fewer levels to the fixed effects instead, to control for
>   them).
> - Only define separate random intercepts for factors that are crossed. For
>   instance, the factors boat name and lance seem suspicious. I guess there is
>   a world in which a 'lance 1' can only be used for fishing on boat
>   'atlantis', but not on boat 'moby dick'. In this case, having a random
>   intercept for "boat name" in addition to 'lance' would not add anything to
>   the model, since lances would already cover the variance of boats (lances
>   nested in boats).
> - Get more observations.
> - Rerun the model.
>
> Should be fine :)
>
> If singularities still occur, use Bayesian models or come back here :))
>
> Best, René
>
>
> On Tue, 26 Mar 2019 at 09:30, Thierry Onkelinx via
> R-sig-mixed-models <r-sig-mixed-models using r-project.org> wrote:
>
>> Dear Alessandra,
>>
>> Your problem is hard to diagnose without the data. Can you make the data
>> available? Does the combination of factors lead to unique observations? Or
>> do some combinations have only zeros?
>>
>> Best regards,
>>
>> ir. Thierry Onkelinx
>> Statisticus / Statistician
>>
>> Vlaamse Overheid / Government of Flanders
>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
>> FOREST
>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>> thierry.onkelinx using inbo.be
>> Havenlaan 88 bus 73, 1000 Brussel
>> www.inbo.be
>>
>>
>> ///////////////////////////////////////////////////////////////////////////////////////////
>> To call in the statistician after the experiment is done may be no more
>> than asking him to perform a post-mortem examination: he may be able to
>> say what the experiment died of. ~ Sir Ronald Aylmer Fisher
>> The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer does not
>> ensure that a reasonable answer can be extracted from a given body of
>> data. ~ John Tukey
>>
>> ///////////////////////////////////////////////////////////////////////////////////////////
>>
>>
>> On Wed, 20 Mar 2019 at 23:19, Alessandra Bielli
>> <bielli.alessandra using gmail.com> wrote:
>>
>> > Dear List
>> >
>> > I am fitting this model using the lme4 package, in order to obtain catch
>> > estimates using the predict function
>> >
>> > m1 <- glmer(Count ~ CE + offset(log(Effort)) + (1|SetYear) + (1|Season) +
>> >               (1|Lance.N) + (1|Boat.Name) + (1|Observer.Name),
>> >             data = Data, family = "poisson",
>> >             control = glmerControl(optimizer = "bobyqa"))
>> >
>> >
>> > where: CE is a categorical variable (control or treatment), Effort is
>> > numerical (fishing effort), and all the other variables are random effects.
>> >
>> > My problem is that I get a warning message saying that the model is
>> > singular:
>> >
>> > > summary(m1)
>> >
>> > Generalized linear mixed model fit by maximum likelihood (Laplace
>> > Approximation) [glmerMod]
>> >  Family: poisson  ( log )
>> > Formula: Count ~ CE + offset(log(Effort)) + (1 | SetYear) + (1 |
>> >     Season) + (1 | Lance.N) + (1 | Boat.Name) + (1 | Observer.Name)
>> >    Data: Data
>> > Control: glmerControl(optimizer = "bobyqa")
>> >
>> >      AIC      BIC   logLik deviance df.resid
>> >    148.6    174.3    -67.3    134.6      285
>> >
>> > Scaled residuals:
>> >     Min      1Q  Median      3Q     Max
>> > -0.4852 -0.1758 -0.1339 -0.1227  3.5980
>> >
>> > Random effects:
>> >  Groups        Name        Variance  Std.Dev.
>> >  Lance.N       (Intercept) 2.259e+00 1.503e+00
>> >  Boat.Name     (Intercept) 0.000e+00 0.000e+00
>> >  Observer.Name (Intercept) 0.000e+00 0.000e+00
>> >  Season        (Intercept) 4.149e-17 6.442e-09
>> >  SetYear       (Intercept) 0.000e+00 0.000e+00
>> > Number of obs: 292, groups:
>> > Lance.N, 146; Boat.Name, 21; Observer.Name, 5; Season, 4; SetYear, 4
>> >
>> > Fixed effects:
>> >             Estimate Std. Error z value Pr(>|z|)
>> > (Intercept)  -2.5751     0.6612  -3.895 9.83e-05 ***
>> > CEE          -0.5878     0.5003  -1.175     0.24
>> > ---
>> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> >
>> > Correlation of Fixed Effects:
>> >     (Intr)
>> > CEE -0.257
>> > convergence code: 0
>> > singular fit
>> >
>> > I am aware that there are a lot of random effects and that some of them
>> > have fewer than 5 levels. However, this study was carried out under real
>> > fishery conditions, so all these random effects seemed important to me.
>> >
>> > I removed the random effects with variance zero, as suggested here:
>> >
>> > https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#singular-models-random-effect-variances-estimated-as-zero-or-correlations-estimated-as---1
>> >
>> > until I had removed them all and found myself with a glm instead.
>> >
>> > My questions are:
>> >
>> > - Why does the variance of Lance.N, initially positive, become zero after I
>> > remove the other random effects that had zero variance?
>> > - Is it acceptable to fit a glm just because all the random effect
>> > variances were zero?
>> >
>> > I hope I gave all the information you need.
>> >
>> > Thanks for any advice!
>> >
>> > Alessandra
>> >
>> >         [[alternative HTML version deleted]]
>> >
>> > _______________________________________________
>> > R-sig-mixed-models using r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >