[R-sig-ME] Estimation of variance components in random- and mixed-effects models

kalman.toth k@|m@n@toth @end|ng |rom protonm@||@com
Sat Jul 3 16:59:52 CEST 2021


Sorry for butting into this conversation but there is something I cannot get my head around. I often encounter situation where B grouping variable is nested into A grouping variable. B has around 10-20 levels and A only has around 3-5. Conceptually, everything is a random effect. If my understanding is correct in this case I should use the following lmer model due to the limited number of levels of B:
R ~ A+(1|B)
Even though, scientifically R ~ (1|A/B) would make more sense.
Variance of B of the first term is not equal to variance of A:B in the second.

I am most interested in a reasonable estimation on the variance of B while we know that it is nested in A.

In general, we use the classical ANOVA to do that: aov(R~A/B). This gives the same variance estimate for A:B interaction as lmer(R ~ (1|A/B)) if it the design is relatively well balanced. In a way I am not surprised because the Cross Validated post suggested by Ben Bolker also mentions that in this case mixed effect model behaves similarly to classical ones. But does it mean that this  ANOVA variance estimate is also biased/unreliable? Would the variance estimation of B from the first lmer model really be a better estimation?

Best Regards,
Kalman Toth

Sent with ProtonMail Secure Email.

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐

On Tuesday, June 29th, 2021 at 2:12 AM, Ben Bolker <bbolker using gmail.com> wrote:

> A couple of quick responses.
>
> -   I don't recommend dropping non-significant predictors, this is a
>
>     good way to overfit models.
> -   Are the temperatures for your three cohorts in a strictly linear
>
>     sequence? i.e., temperature (cohort 1) = T1, temp (2) = T1 + delta, temp
>
>     (3) = T1 + 2*delta ? In that case, the two effects are indeed
>
>     identical/confounded. In principle, your original model (using cohort
>
>     as a random effect and temperature as fixed) is the right way to handle
>
>     this, but for the size of data set you can't really identify
>
>     among-cohort variation beyond the effect of temperature.
>
>     A nice way to handle this is to treat cohort as an ordered
>
>     categorical fixed effect (see ?ordered), and leave out temperature (this
>
>     is assuming that the temperatures are as suggested above). If you do
>
>     this (i.e. convert cohort to 'ordered' type), R will fit two parameters,
>
>     one labeled .L and the other labeled .Q, which together explain all of
>
>     the among-cohort variation; if you like (although it is quite a big
>
>     assumption, and you must be explicit about it), you can ascribe the
>
>     linear (".L") variation to temperature and the other (".Q" or quadratic)
>
>     to non-temperature effects. However, given your experimental design,
>
>     the following two explanations would be equally well supported:
> -   none of the between-cohort variation is due to temperature;
> -   temperature has a quadratic effect, so all of the between-cohort
>
>     variation is due to temperature.
>
>     cheers
>
>     Ben Bolker
>
>     On 6/28/21 3:20 PM, Amy Huang wrote:
>
> > Thank you very much for your responses and references. Sorry that I missed
> >
> > mentioning a lot of information.
> >
> > I am using lme4, and the fixed predictors are all numeric. Only having 3
> >
> > levels of cohorts is indeed the major issue. After removing insignificant
> >
> > predictors in the 2nd model, the only factor left is temperature: offspring
> >
> > trait ~ temperature + (1 | cohort/female/clutch), which gives the
> >
> > convergence warning.
> >
> > Now I treat cohort as a fixed effect, but when I include both cohort and
> >
> > temperature as fixed effects (in the 2nd model), a warning appears:
> >
> > "fixed-effect model matrix is rank deficient so dropping 1 column /
> >
> > coefficient". When I remove cohort (2nd model), the two models become very
> >
> > similar and give similar results.
> >
> > offspring trait ~ cohort + (1 | female/clutch)
> >
> > offspring trait ~ temperature + (1 | female/clutch)
> >
> > But am I not introducing pseudoreplicates if I do not include cohort as a
> >
> > factor?
> >
> > PS. The section "How do I compute a coefficient of determination (R2), or
> >
> > an analogue, for (G)LMMs?" in the GLMM FAQ also gives me some insight.
> >
> > However, the links provided there seem to be not working.
> >
> > Best regards,
> >
> > Amy Huang
> >
> > Am Mo., 28. Juni 2021 um 19:32 Uhr schrieb Ben Bolker bbolker using gmail.com:
> >
> > >     See also:
> > >
> > >
> > > https://stats.stackexchange.com/questions/37647/what-is-the-minimum-recommended-number-of-groups-for-a-random-effects-factor
> > >
> > > https://www.biorxiv.org/content/10.1101/2021.05.03.442487v2
> > >
> > > (I should these links, and the blog post link, to the GLMM FAQ ...)
> > >
> > > On 6/28/21 1:17 PM, Thierry Onkelinx wrote:
> > >
> > > > Another issue is that you have too few levels to fit "cohort" as a
> > > >
> > > > random effect. I wrote a blogpost on this a few years ago:
> > > >
> > > > https://www.muscardinus.be/2018/09/number-random-effect-levels/
> > > >
> > > > https://www.muscardinus.be/2018/09/number-random-effect-levels/
> > > >
> > > > Best regards,
> > > >
> > > > ir. Thierry Onkelinx
> > > >
> > > > Statisticus / Statistician
> > > >
> > > > Vlaamse Overheid / Government of Flanders
> > > >
> > > > INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
> > > >
> > > > AND FOREST
> > > >
> > > > Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> > > >
> > > > thierry.onkelinx using inbo.be mailto:thierry.onkelinx using inbo.be
> > > >
> > > > Havenlaan 88 bus 73, 1000 Brussel
> > > >
> > > > www.inbo.be http://www.inbo.be
> > >
> > > ///////////////////////////////////////////////////////////////////////////////////////////
> > >
> > > > To call in the statistician after the experiment is done may be no more
> > > >
> > > > than asking him to perform a post-mortem examination: he may be able to
> > > >
> > > > say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> > > >
> > > > The plural of anecdote is not data. ~ Roger Brinner
> > > >
> > > > The combination of some data and an aching desire for an answer does not
> > > >
> > > > ensure that a reasonable answer can be extracted from a given body of
> > > >
> > > > data. ~ John Tukey
> > >
> > > ///////////////////////////////////////////////////////////////////////////////////////////
> > >
> > > > https://www.inbo.be
> > > >
> > > > Op ma 28 jun. 2021 om 16:31 schreef Ben Bolker <bbolker using gmail.com
> > > >
> > > > mailto:bbolker using gmail.com>:
> > > >
> > > >          Are you using lme4? (I'm 99% sure you are, but it's good to be
> > > >      explicit.)
> > > >
> > > >          Are all of your fixed predictors numeric (rather than
> > > >      factor/categorical) ?
> > > >
> > > >          Note that a convergence warning is a *warning*, not an error:
> > > >
> > >
> > > have
> > >
> > > >      you checked the troubleshooting steps in ?lme4::convergence (in
> > > >      particular, scaling and centering your predictor variables might
> > > >      help ...)
> > > >
> > > >          cheers
> > > >           Ben Bolker
> > > >
> > > >
> > > >      On 6/28/21 10:17 AM, Amy Huang wrote:
> > > >       > Dear all,
> > > >       >
> > > >       > I am examining maternal effects, and my data have three hierarchy
> > > >      levels:
> > > >       > clutches of the same female, females, and cohorts. My explanatory
> > > >      variables
> > > >       > are at the female level (female length, age) and at the cohort
> > > >
> > >
> > > level
> > >
> > > >       > (temperature).
> > > >       >
> > > >       > I would like to estimate the variance components of each
> > > >      hierarchy level
> > > >       > (i.e. relative amount of variance at each level) and then to find
> > > >      out which
> > > >       > factors (female length, age, temperature) explain most of the
> > > >      variance. For
> > > >       > these, I have two models:
> > > >       >      offspring trait ~ 1 + (1 | cohort/female/clutch)
> > > >       >      offspring trait ~ temperature + female length + age + (1 |
> > > >       > cohort/female/clutch)
> > > >       >
> > > >       > The major problem is that I only have 3 cohorts (and so 3
> > > >      temperatures).
> > > >       >  From the first model I am able to get the information, but from
> > > >      the second
> > > >       > one there is an error message: "Model failed to converge with 1
> > > >      negative
> > > >       > eigenvalue: -2.0e+01". The error pops up probably because I have
> > > >
> > >
> > > both
> > >
> > > >       > temperature (fixed) and cohort (random) included. Is my approach
> > > >      correct?
> > > >       > And is there a way to fix this error?
> > > >       >
> > > >       > Thank you so much for your time.
> > > >       >
> > > >       > Best regards,
> > > >       > Amy Huang
> > > >       >
> > > >       >       [[alternative HTML version deleted]]
> > > >       >
> > > >       > _______________________________________________
> > > >       > R-sig-mixed-models using r-project.org
> > > >      <mailto:R-sig-mixed-models using r-project.org> mailing list
> > > >       > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > >      <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> > > >       >
> > > >
> > > >      _______________________________________________
> > > >      R-sig-mixed-models using r-project.org
> > > >      <mailto:R-sig-mixed-models using r-project.org> mailing list
> > > >      https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > >      <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> > > >
> > >
> > > R-sig-mixed-models using r-project.org mailing list
> > >
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> > [[alternative HTML version deleted]]
> >
> > R-sig-mixed-models using r-project.org mailing list
> >
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> R-sig-mixed-models using r-project.org mailing list
>
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



More information about the R-sig-mixed-models mailing list