[R-sig-ME] Estimation of variance components in random- and mixed-effects models

Tue Jun 29 02:12:09 CEST 2021

   A couple of *quick* responses.

   * I don't recommend dropping non-significant predictors, this is a 
good way to overfit models.
   * Are the temperatures for your three cohorts in a strictly linear 
sequence? i.e., temperature (cohort 1) = T1, temp (2) = T1 + delta, temp 
(3) = T1 + 2*delta ? In that case, the two effects are indeed 
identical/confounded.  In principle, your original model (using cohort 
as a random effect and temperature as fixed) is the right way to handle 
this, but for the size of data set you can't really identify 
among-cohort variation beyond the effect of temperature.

   A nice way to handle this is to treat cohort as an *ordered* 
categorical fixed effect (see ?ordered), and leave out temperature (this 
is assuming that the temperatures are as suggested above).  If you do 
this (i.e. convert cohort to 'ordered' type), R will fit two parameters, 
one labeled .L and the other labeled .Q, which together explain all of 
the among-cohort variation; if you like (although it is quite a big 
assumption, and you must be explicit about it), you can ascribe the 
linear (".L") variation to temperature and the other (".Q" or quadratic) 
to non-temperature effects.  However, given your experimental design, 
the following two explanations would be *equally well supported*:

  * none of the between-cohort variation is due to temperature;
  * temperature has a quadratic effect, so all of the between-cohort 
variation is due to temperature.

  cheers
    Ben Bolker

On 6/28/21 3:20 PM, Amy Huang wrote:
> Thank you very much for your responses and references. Sorry that I missed
> mentioning a lot of information.
> 
> I am using lme4, and the fixed predictors are all numeric. Only having 3
> levels of cohorts is indeed the major issue. After removing insignificant
> predictors in the 2nd model, the only factor left is temperature: offspring
> trait ~ temperature + (1 | cohort/female/clutch), which gives the
> convergence warning.
> 
> Now I treat cohort as a fixed effect, but when I include both cohort and
> temperature as fixed effects (in the 2nd model), a warning appears:
> "fixed-effect model matrix is rank deficient so dropping 1 column /
> coefficient". When I remove cohort (2nd model), the two models become very
> similar and give similar results.
>      offspring trait ~ cohort + (1 | female/clutch)
>      offspring trait ~ temperature + (1 | female/clutch)
> But am I not introducing pseudoreplicates if I do not include cohort as a
> factor?
> 
> PS. The section "How do I compute a coefficient of determination (R2), or
> an analogue, for (G)LMMs?" in the GLMM FAQ also gives me some insight.
> However, the links provided there seem to be not working.
> 
> Best regards,
> Amy Huang
> 
> Am Mo., 28. Juni 2021 um 19:32 Uhr schrieb Ben Bolker <bbolker using gmail.com>:
> 
>>     See also:
>>
>>
>> https://stats.stackexchange.com/questions/37647/what-is-the-minimum-recommended-number-of-groups-for-a-random-effects-factor
>>
>> https://www.biorxiv.org/content/10.1101/2021.05.03.442487v2
>>
>>    (I should these links, and the blog post link, to the GLMM FAQ ...)
>>
>> On 6/28/21 1:17 PM, Thierry Onkelinx wrote:
>>> Another issue is that you have too few levels to fit "cohort" as a
>>> random effect. I wrote a blogpost on this a few years ago:
>>> https://www.muscardinus.be/2018/09/number-random-effect-levels/
>>> <https://www.muscardinus.be/2018/09/number-random-effect-levels/>
>>>
>>> Best regards,
>>>
>>> ir. Thierry Onkelinx
>>> Statisticus / Statistician
>>>
>>> Vlaamse Overheid / Government of Flanders
>>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
>>> AND FOREST
>>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>>> thierry.onkelinx using inbo.be <mailto:thierry.onkelinx using inbo.be>
>>> Havenlaan 88 bus 73, 1000 Brussel
>>> www.inbo.be <http://www.inbo.be>
>>>
>>>
>> ///////////////////////////////////////////////////////////////////////////////////////////
>>> To call in the statistician after the experiment is done may be no more
>>> than asking him to perform a post-mortem examination: he may be able to
>>> say what the experiment died of. ~ Sir Ronald Aylmer Fisher
>>> The plural of anecdote is not data. ~ Roger Brinner
>>> The combination of some data and an aching desire for an answer does not
>>> ensure that a reasonable answer can be extracted from a given body of
>>> data. ~ John Tukey
>>>
>> ///////////////////////////////////////////////////////////////////////////////////////////
>>>
>>> <https://www.inbo.be>
>>>
>>>
>>> Op ma 28 jun. 2021 om 16:31 schreef Ben Bolker <bbolker using gmail.com
>>> <mailto:bbolker using gmail.com>>:
>>>
>>>          Are you using lme4? (I'm 99% sure you are, but it's good to be
>>>      explicit.)
>>>
>>>          Are all of your fixed predictors numeric (rather than
>>>      factor/categorical) ?
>>>
>>>          Note that a convergence warning is a *warning*, not an error:
>> have
>>>      you checked the troubleshooting steps in ?lme4::convergence (in
>>>      particular, scaling and centering your predictor variables might
>>>      help ...)
>>>
>>>          cheers
>>>           Ben Bolker
>>>
>>>
>>>      On 6/28/21 10:17 AM, Amy Huang wrote:
>>>       > Dear all,
>>>       >
>>>       > I am examining maternal effects, and my data have three hierarchy
>>>      levels:
>>>       > clutches of the same female, females, and cohorts. My explanatory
>>>      variables
>>>       > are at the female level (female length, age) and at the cohort
>> level
>>>       > (temperature).
>>>       >
>>>       > I would like to estimate the variance components of each
>>>      hierarchy level
>>>       > (i.e. relative amount of variance at each level) and then to find
>>>      out which
>>>       > factors (female length, age, temperature) explain most of the
>>>      variance. For
>>>       > these, I have two models:
>>>       >      offspring trait ~ 1 + (1 | cohort/female/clutch)
>>>       >      offspring trait ~ temperature + female length + age + (1 |
>>>       > cohort/female/clutch)
>>>       >
>>>       > The major problem is that I only have 3 cohorts (and so 3
>>>      temperatures).
>>>       >  From the first model I am able to get the information, but from
>>>      the second
>>>       > one there is an error message: "Model failed to converge with 1
>>>      negative
>>>       > eigenvalue: -2.0e+01". The error pops up probably because I have
>> both
>>>       > temperature (fixed) and cohort (random) included. Is my approach
>>>      correct?
>>>       > And is there a way to fix this error?
>>>       >
>>>       > Thank you so much for your time.
>>>       >
>>>       > Best regards,
>>>       > Amy Huang
>>>       >
>>>       >       [[alternative HTML version deleted]]
>>>       >
>>>       > _______________________________________________
>>>       > R-sig-mixed-models using r-project.org
>>>      <mailto:R-sig-mixed-models using r-project.org> mailing list
>>>       > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>      <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>>>       >
>>>
>>>      _______________________________________________
>>>      R-sig-mixed-models using r-project.org
>>>      <mailto:R-sig-mixed-models using r-project.org> mailing list
>>>      https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>      <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
>>>
>>
>> _______________________________________________
>> R-sig-mixed-models using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>