[R-sig-ME] defining nested and crossed random effects in glmer

Thierry Onkelinx thierry.onkelinx using inbo.be
Tue Jul 12 10:03:45 CEST 2022

Dear Anu,

In case every species has a (near) unique value for trait, then trait as
factor becomes just another "label" for the species. trait + (1|species)
would be the same as species + (1|species), which clearly doesn't
make sense. trait + (1|species) only makes sense when the trait is
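
A quick way to check this on your data is to cross-tabulate species against
trait treated as a factor. The sketch below uses only the example rows from
your message; the data frame name d and the object one_to_one are
illustrative and not part of your script.

```r
# Example rows taken from the quoted message: one trait value per species.
d <- data.frame(
  species = rep(c("Antodo", "Helpra", "Seqvar"), each = 3),
  trait   = rep(c(0.2, 0.5, 0.03), each = 3)
)

# Cross-tabulate species against trait treated as a factor.
tab <- table(d$species, factor(d$trait))

# If every row and every column has exactly one non-zero cell, the two
# variables define the same grouping and differ only in their labels.
one_to_one <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
one_to_one
```

For these rows one_to_one is TRUE: each species maps to exactly one trait
level and vice versa.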

Your model seems to contain a lot of parameters, and the number of
observations seems limited. I estimate your model has about 18 parameters.
As a rule of thumb, you need ten times that number of "effective
observations". In case of a Bernoulli response, the effective number is the
number of 0s or the number of 1s, whichever is lower. E.g. with 18
parameters you need 180 effective observations; if 25% of the responses
are "1", that means 180 times "1", or 720 observations in total.
You might need to collect more data or simplify the model.
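
As a rough sketch of that arithmetic in R: the variable names and the helper
effective_n() are only illustrative. The split of the roughly 18 parameters
into 15 fixed effects and 3 variances is an assumption about how such a count
can arise for a formula like (a + b + c + d)^3 + (1|block/plot) + (1|species),
with one coefficient per predictor.

```r
# Assumed breakdown of the parameter count (not stated explicitly above):
# four predictors with one coefficient each, all terms up to 3-way.
n_fixed  <- sum(choose(4, 0:3))  # intercept + main effects + 2-way + 3-way
n_random <- 3                    # variances: block, plot within block, species
n_params <- n_fixed + n_random

obs_per_param <- 10    # rule-of-thumb multiplier
p_minority    <- 0.25  # proportion of the rarer outcome (here the 1s)

# Required count of the rarer outcome ("effective observations").
n_effective_needed <- n_params * obs_per_param

# Total observations needed so that the rarer outcome reaches that count.
n_total_needed <- n_effective_needed / p_minority

# Hypothetical helper: effective sample size of an observed 0/1 response.
effective_n <- function(y) min(sum(y == 0), sum(y == 1))

# On your own data this would be something like:
# effective_n(traits$species_responsiveness)
```

Comparing effective_n() for your response with n_effective_needed gives a
quick indication of how far the data are from the rule of thumb.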

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey


On Mon 11 Jul 2022 at 11:36, Eskelinen, Anu <anu.eskelinen using idiv.de> wrote:

> Hello,
> I have a question related to defining a random formula in a mixed effects
> model in glmer. I would be grateful for any help from this expert
> community. Thanks a lot already in advance!
> My question relates to ecological plant species data from a full-factorial
> experiment with four treatments. In the experiment, treatment 1 is nested
> within treatments 2 and 3, and treatments 2 and 3 are nested within
> treatment 4. All treatments have two levels (manipulation, no
> manipulation). Treatment 1 subplots (manipulation, no manipulation) are
> next to each other (paired), forming bigger plots that receive a
> combination of treatments 2 and 3. These plots are arranged in groups of
> four (called blocks) that receive treatment 4. Pretty complex experiment,
> yes, as many ecological questions are complex.
> My question concerns the following analysis. We have calculated a binary
> 0-1 response variable between the subplots for plant species that occur in
> the subplots. We call this “species responsiveness to treatment 1”. We then
> explain this “species responsiveness to treatment 1” by treatments 2, 3 and
> 4, and by an additional continuous variable which is a trait that describes
> some characteristics of the species. The main question is to assess whether
> species’ traits affect their responsiveness to treatment 1, and whether
> species traits and other treatments interact to affect their responsiveness
> to treatment 1.
> I have used this model in R to examine the above question:
> model <- glmer(species_responsiveness ~
> (treatment4+treatment3+treatment2+trait)^3 + (1|block/plot),
> family=binomial, data=traits)
> As far as I understand, this random structure should take into account
> that 1) species observations are nested within plots, and 2) plots are
> nested within blocks. There are multiple species occurring in the same
> plots, that’s why species is nested within plots, and most species also
> occur in several plots but not all species occur in all plots. There are
> many species (~40) in the data.
> My question is: does species identity also need to be a crossed
> random factor in the model? Like this:
> +(1|block/plot) + (1|species)
> With this random formula, the model would be:
> model <- glmer(species_responsiveness ~
> (treatment4+treatment3+treatment2+trait)^3 + (1|block/plot) + (1|species),
> family=binomial, data=traits)
> However, with this random structure I get a lot of complaints and
> sometimes the models do not converge at all. Models without species
> converge normally. Scaling the trait values does not help. I have several
> traits for the species, each analyzed in its own model, and they are all
> performing equally badly. I get, for example, these error messages:
> Warning messages:
> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model failed to converge with max|grad| = 0.0841993 (tol = 0.002,
> component 1)
> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>   Model is nearly unidentifiable: very large eigenvalue
>  - Rescale variables?
> Or
> Model failed to converge with max|grad| = 0.0841993 (tol = 0.002,
> component 1)
> Model is nearly unidentifiable: very large eigenvalue
>  - Rescale variables?
> In the data, “species” overlaps/aligns perfectly with “trait” as each
> individual species has only one unique value for each trait. Here’s an
> example how it can look in the data:
> species      trait
> Antodo       0.2
> Antodo       0.2
> Antodo       0.2
> Helpra       0.5
> Helpra       0.5
> Helpra       0.5
> Seqvar       0.03
> Seqvar       0.03
> Seqvar       0.03
> So, when I have “trait” as a fixed factor and “species” as a random
> factor, do they compete to explain the same variation in the data? And can
> that be causing problems in model convergence? What kind of random formula
> would be correct? I would be super grateful for any advice/thoughts. Thanks
> so much!
> Best regards,
> Anu
> Assoc. Prof. Anu Eskelinen
> Oulu University, Finland
> iDiv and UFZ, Leipzig, Germany
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
