[R-sig-ME] Estimation of variance components in random- and mixed-effects models

Sun Jul 4 00:39:24 CEST 2021

 Thank you very much for your valuable insights and suggestions, Thierry
and Dr. Bolker! They have brought me to relearning and expanding my
statistical knowledge.

Treating the predictor "cohort" as an ordered factor is actually most
suitable for my original experimental design. I study cohort effects with
the levels "early, middle and late". The in-situ temperatures (T) I
measured by each level were in a nearly linear sequence "by chance": T2 =
T1 + delta, T3 = T1 + 1.7 * delta.

Because of this, linking cohort effect to temperature would probably be
more difficult. With the new model
    offspring trait ~ cohort +female length + (1 | female/clutch),
the only significant fixed predictor is cohort.L.

It suggests the levels "early, middle and late" are best interpreted as
having nearly equal-spaced intervals (similar to the temperatures). But it
does not support directly that the variance among cohorts is due to
temperature. So the best I could do is to show the correlation between
cohorts and temperatures and propose the link (?)

(Thank you again for the warning about model selection, I am carefully
revising and relearning the process.)

(I tried to use a subset of the data to compare the variance estimates by
classical ANOVA and mixed-effects model, but it is unfortunately too much
for my brain right now..)

Best regards,
Amy Huang

Am Sa., 3. Juli 2021 um 17:00 Uhr schrieb kalman.toth via
R-sig-mixed-models <r-sig-mixed-models using r-project.org>:

> Sorry for butting into this conversation but there is something I cannot
> get my head around. I often encounter situation where B grouping variable
> is nested into A grouping variable. B has around 10-20 levels and A only
> has around 3-5. Conceptually, everything is a random effect. If my
> understanding is correct in this case I should use the following lmer model
> due to the limited number of levels of B:
> R ~ A+(1|B)
> Even though, scientifically R ~ (1|A/B) would make more sense.
> Variance of B of the first term is not equal to variance of A:B in the
> second.
>
> I am most interested in a reasonable estimation on the variance of B while
> we know that it is nested in A.
>
> In general, we use the classical ANOVA to do that: aov(R~A/B). This gives
> the same variance estimate for A:B interaction as lmer(R ~ (1|A/B)) if it
> the design is relatively well balanced. In a way I am not surprised because
> the Cross Validated post suggested by Ben Bolker also mentions that in this
> case mixed effect model behaves similarly to classical ones. But does it
> mean that this  ANOVA variance estimate is also biased/unreliable? Would
> the variance estimation of B from the first lmer model really be a better
> estimation?
>
> Best Regards,
> Kalman Toth
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>
> On Tuesday, June 29th, 2021 at 2:12 AM, Ben Bolker <bbolker using gmail.com>
> wrote:
>
> > A couple of quick responses.
> >
> > -   I don't recommend dropping non-significant predictors, this is a
> >
> >     good way to overfit models.
> > -   Are the temperatures for your three cohorts in a strictly linear
> >
> >     sequence? i.e., temperature (cohort 1) = T1, temp (2) = T1 + delta,
> temp
> >
> >     (3) = T1 + 2*delta ? In that case, the two effects are indeed
> >
> >     identical/confounded. In principle, your original model (using cohort
> >
> >     as a random effect and temperature as fixed) is the right way to
> handle
> >
> >     this, but for the size of data set you can't really identify
> >
> >     among-cohort variation beyond the effect of temperature.
> >
> >     A nice way to handle this is to treat cohort as an ordered
> >
> >     categorical fixed effect (see ?ordered), and leave out temperature
> (this
> >
> >     is assuming that the temperatures are as suggested above). If you do
> >
> >     this (i.e. convert cohort to 'ordered' type), R will fit two
> parameters,
> >
> >     one labeled .L and the other labeled .Q, which together explain all
> of
> >
> >     the among-cohort variation; if you like (although it is quite a big
> >
> >     assumption, and you must be explicit about it), you can ascribe the
> >
> >     linear (".L") variation to temperature and the other (".Q" or
> quadratic)
> >
> >     to non-temperature effects. However, given your experimental design,
> >
> >     the following two explanations would be equally well supported:
> > -   none of the between-cohort variation is due to temperature;
> > -   temperature has a quadratic effect, so all of the between-cohort
> >
> >     variation is due to temperature.
> >
> >     cheers
> >
> >     Ben Bolker
> >
> >     On 6/28/21 3:20 PM, Amy Huang wrote:
> >
> > > Thank you very much for your responses and references. Sorry that I
> missed
> > >
> > > mentioning a lot of information.
> > >
> > > I am using lme4, and the fixed predictors are all numeric. Only having
> 3
> > >
> > > levels of cohorts is indeed the major issue. After removing
> insignificant
> > >
> > > predictors in the 2nd model, the only factor left is temperature:
> offspring
> > >
> > > trait ~ temperature + (1 | cohort/female/clutch), which gives the
> > >
> > > convergence warning.
> > >
> > > Now I treat cohort as a fixed effect, but when I include both cohort
> and
> > >
> > > temperature as fixed effects (in the 2nd model), a warning appears:
> > >
> > > "fixed-effect model matrix is rank deficient so dropping 1 column /
> > >
> > > coefficient". When I remove cohort (2nd model), the two models become
> very
> > >
> > > similar and give similar results.
> > >
> > > offspring trait ~ cohort + (1 | female/clutch)
> > >
> > > offspring trait ~ temperature + (1 | female/clutch)
> > >
> > > But am I not introducing pseudoreplicates if I do not include cohort
> as a
> > >
> > > factor?
> > >
> > > PS. The section "How do I compute a coefficient of determination (R2),
> or
> > >
> > > an analogue, for (G)LMMs?" in the GLMM FAQ also gives me some insight.
> > >
> > > However, the links provided there seem to be not working.
> > >
> > > Best regards,
> > >
> > > Amy Huang
> > >
> > > Am Mo., 28. Juni 2021 um 19:32 Uhr schrieb Ben Bolker
> bbolker using gmail.com:
> > >
> > > >     See also:
> > > >
> > > >
> > > >
> https://stats.stackexchange.com/questions/37647/what-is-the-minimum-recommended-number-of-groups-for-a-random-effects-factor
> > > >
> > > > https://www.biorxiv.org/content/10.1101/2021.05.03.442487v2
> > > >
> > > > (I should these links, and the blog post link, to the GLMM FAQ ...)
> > > >
> > > > On 6/28/21 1:17 PM, Thierry Onkelinx wrote:
> > > >
> > > > > Another issue is that you have too few levels to fit "cohort" as a
> > > > >
> > > > > random effect. I wrote a blogpost on this a few years ago:
> > > > >
> > > > > https://www.muscardinus.be/2018/09/number-random-effect-levels/
> > > > >
> > > > > https://www.muscardinus.be/2018/09/number-random-effect-levels/
> > > > >
> > > > > Best regards,
> > > > >
> > > > > ir. Thierry Onkelinx
> > > > >
> > > > > Statisticus / Statistician
> > > > >
> > > > > Vlaamse Overheid / Government of Flanders
> > > > >
> > > > > INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR
> NATURE
> > > > >
> > > > > AND FOREST
> > > > >
> > > > > Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality
> Assurance
> > > > >
> > > > > thierry.onkelinx using inbo.be mailto:thierry.onkelinx using inbo.be
> > > > >
> > > > > Havenlaan 88 bus 73, 1000 Brussel
> > > > >
> > > > > www.inbo.be http://www.inbo.be
> > > >
> > > >
> ///////////////////////////////////////////////////////////////////////////////////////////
> > > >
> > > > > To call in the statistician after the experiment is done may be no
> more
> > > > >
> > > > > than asking him to perform a post-mortem examination: he may be
> able to
> > > > >
> > > > > say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> > > > >
> > > > > The plural of anecdote is not data. ~ Roger Brinner
> > > > >
> > > > > The combination of some data and an aching desire for an answer
> does not
> > > > >
> > > > > ensure that a reasonable answer can be extracted from a given body
> of
> > > > >
> > > > > data. ~ John Tukey
> > > >
> > > >
> ///////////////////////////////////////////////////////////////////////////////////////////
> > > >
> > > > > https://www.inbo.be
> > > > >
> > > > > Op ma 28 jun. 2021 om 16:31 schreef Ben Bolker <bbolker using gmail.com
> > > > >
> > > > > mailto:bbolker using gmail.com>:
> > > > >
> > > > >          Are you using lme4? (I'm 99% sure you are, but it's good
> to be
> > > > >      explicit.)
> > > > >
> > > > >          Are all of your fixed predictors numeric (rather than
> > > > >      factor/categorical) ?
> > > > >
> > > > >          Note that a convergence warning is a *warning*, not an
> error:
> > > > >
> > > >
> > > > have
> > > >
> > > > >      you checked the troubleshooting steps in ?lme4::convergence
> (in
> > > > >      particular, scaling and centering your predictor variables
> might
> > > > >      help ...)
> > > > >
> > > > >          cheers
> > > > >           Ben Bolker
> > > > >
> > > > >
> > > > >      On 6/28/21 10:17 AM, Amy Huang wrote:
> > > > >       > Dear all,
> > > > >       >
> > > > >       > I am examining maternal effects, and my data have three
> hierarchy
> > > > >      levels:
> > > > >       > clutches of the same female, females, and cohorts. My
> explanatory
> > > > >      variables
> > > > >       > are at the female level (female length, age) and at the
> cohort
> > > > >
> > > >
> > > > level
> > > >
> > > > >       > (temperature).
> > > > >       >
> > > > >       > I would like to estimate the variance components of each
> > > > >      hierarchy level
> > > > >       > (i.e. relative amount of variance at each level) and then
> to find
> > > > >      out which
> > > > >       > factors (female length, age, temperature) explain most of
> the
> > > > >      variance. For
> > > > >       > these, I have two models:
> > > > >       >      offspring trait ~ 1 + (1 | cohort/female/clutch)
> > > > >       >      offspring trait ~ temperature + female length + age +
> (1 |
> > > > >       > cohort/female/clutch)
> > > > >       >
> > > > >       > The major problem is that I only have 3 cohorts (and so 3
> > > > >      temperatures).
> > > > >       >  From the first model I am able to get the information,
> but from
> > > > >      the second
> > > > >       > one there is an error message: "Model failed to converge
> with 1
> > > > >      negative
> > > > >       > eigenvalue: -2.0e+01". The error pops up probably because
> I have
> > > > >
> > > >
> > > > both
> > > >
> > > > >       > temperature (fixed) and cohort (random) included. Is my
> approach
> > > > >      correct?
> > > > >       > And is there a way to fix this error?
> > > > >       >
> > > > >       > Thank you so much for your time.
> > > > >       >
> > > > >       > Best regards,
> > > > >       > Amy Huang
> > > > >       >
> > > > >       >       [[alternative HTML version deleted]]
> > > > >       >
> > > > >       > _______________________________________________
> > > > >       > R-sig-mixed-models using r-project.org
> > > > >      <mailto:R-sig-mixed-models using r-project.org> mailing list
> > > > >       > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > > >      <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> > > > >       >
> > > > >
> > > > >      _______________________________________________
> > > > >      R-sig-mixed-models using r-project.org
> > > > >      <mailto:R-sig-mixed-models using r-project.org> mailing list
> > > > >      https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > > > >      <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> > > > >
> > > >
> > > > R-sig-mixed-models using r-project.org mailing list
> > > >
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > R-sig-mixed-models using r-project.org mailing list
> > >
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> > R-sig-mixed-models using r-project.org mailing list
> >
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]