[R-sig-ME] Grouping variables technically suitable for modeling

Tue Nov 9 18:30:49 CET 2021

Dear Thierry,

Your point that "As ID2 defines almost the same grouping as ID1, it doesn't
make sense to include both of them in the model" makes good sense.
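
A minimal sketch of why the two groupings nearly coincide, using plain Python rather than a mixed-model fit. The layout is an assumption: 130 ID2 levels nested in 120 ID1 levels (the numbers from the example quoted further down in this thread), with the ten extra ID2 levels placed in the first ten ID1 levels. The helper names `nested_ids` and `ridge_penalty` are hypothetical, and the penalty assumes, purely for illustration, that both random effects share the same variance.

```python
from collections import Counter


def nested_ids(n_id1=120, n_id2=130):
    """Return (id1, id2) pairs with ID2 nested in ID1.

    Assumed layout: the first (n_id2 - n_id1) ID1 levels hold two ID2
    levels each; every other ID1 level holds exactly one.
    """
    pairs = []
    id2 = 0
    extra = n_id2 - n_id1
    for id1 in range(n_id1):
        n_children = 2 if id1 < extra else 1
        for _ in range(n_children):
            pairs.append((id1, id2))
            id2 += 1
    return pairs


def ridge_penalty(u, v, sigma2=1.0):
    """Quadratic penalty on two random effects with a common variance
    (an illustrative assumption, not a fitted value)."""
    return (u ** 2 + v ** 2) / (2 * sigma2)


pairs = nested_ids()
id2_per_id1 = Counter(id1 for id1, _ in pairs)
# ID1 levels that contain a single ID2 level: there the two effects
# only ever appear as a sum.
singletons = sum(1 for k in id2_per_id1.values() if k == 1)
print(len(pairs), len(id2_per_id1), singletons)  # 130 120 110

# Combined effect b in a singleton group: all on ID1 vs split evenly.
b = 1.0
print(ridge_penalty(b, 0.0), ridge_penalty(b / 2, b / 2))  # 0.5 0.25
```

Under this assumed layout, 110 of the 120 ID1 levels contain a single ID2 level, so in those groups the data inform only the sum of the two effects. Splitting that sum evenly halves the quadratic penalty, which is in line with the expectation Thierry describes.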

Thanks!
Tim M

On Tue, Nov 9, 2021 at 9:03 AM Thierry Onkelinx
<thierry.onkelinx using inbo.be> wrote:
>
> Dear Timothy,
>
> I would expect that, in your example, the combined effect of ID1 and ID2 will be split more or less equally over ID1 and ID2, as this yields a lower penalty than attributing the effect fully to either ID1 or ID2. Hence the random effect variances of 1|ID1/ID2 will be a lot smaller than those of 1|ID1 or 1|ID2.
>
> As ID2 defines almost the same grouping as ID1, it doesn't make sense to include both of them in the model.
>
> I have no reference at hand for this. Just common sense.
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx using inbo.be
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be
>
> On Tue, Nov 9, 2021 at 15:19, Timothy MacKenzie <fswfswt using gmail.com> wrote:
>>
>> Dear Ben,
>>
>> Thank you for sharing the references regarding my first question.
>>
>> Regarding my second question, I simply mean that if we have, say, ID1 and
>> ID2, then for ID2 to be distinguishably nested in ID1, it needs to have a
>> sufficiently different number of unique categories relative to ID1.
>>
>> For example, if ID1 has 120 unique categories and ID2 has 130
>> unique categories nested in ID1, then the variance components for ID1 and
>> ID2 are not distinguishable from each other. As a result, only one of them
>> can be added as a random effect: either (1 | ID1) or (1 | ID2), but not
>> (1 | ID1/ID2).
>>
>> Is this correct and is there a published reference confirming or
>> disconfirming this?
>>
>> Thanks,
>> Tim M
>>
>> On Mon, Nov 8, 2021 at 7:35 PM Ben Bolker <bbolker using gmail.com> wrote:
>>
>> >
>> >     This is a bit of a "how long is a piece of string" question ...
>> >
>> >
>> >    The "5-6 levels of a grouping variable" rule of thumb is quoted in
>> > various places: a variety of those references (Gelman and Hill 2006,
>> > Kéry and Royle 2015, Harrison et al 2018, Arnqvist 2020) are collected
>> > by Gomes
>> > (https://www.biorxiv.org/content/10.1101/2021.04.11.439357v2.full).
>> >
>> >    I sort of see what you mean by your second paragraph, but can you
>> > give an example?
>> >
>> >
>> > On 11/7/21 5:20 PM, Timothy MacKenzie wrote:
>> > > Dear Experts,
>> > >
>> > > Apologies if this question has come up before, but I'm looking for
>> > > published references that provide guidance on when one or more grouping
>> > > variables that theoretically need to be random factors can also
>> > > "technically" be used as random factors.
>> > >
>> > > For example, I have heard that for a grouping variable to be technically
>> > > taken as a random factor, it needs to have at least 10 or so unique
>> > > categories. (Any reference to confirm or disconfirm this?)
>> > >
>> > > As a second example, I have heard that for two grouping variables to be
>> > > technically taken as random factors, they each need to have a sufficiently
>> > > different number of unique categories relative to the other one. Otherwise,
>> > > their variance components can't be distinguished from one another and thus
>> > > only one of them can be taken as random, not both. (Any reference to confirm
>> > > or disconfirm this?)
>> > >
>> > > Thanks,
>> > > Tim M
>> > >
>> > > _______________________________________________
>> > > R-sig-mixed-models using r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models