[R-sig-ME] Is crossed random-effect the only choice?

Mon Jul 19 19:29:04 CEST 2021

Dear Thierry,

Thanks. Given the data structure below, your previous comment (i.e., H being
implicitly nested in X), and your webpage, do you mean (1|X/H)?

H  X
1   2
1   2
2   1
2   1
2   1
3   2
4   1
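
As a minimal sketch, assuming the seven rows above are the whole data set,
the nesting in this table can be checked mechanically. The snippet is plain
Python rather than R, and the helper name is_nested is hypothetical. It tests
whether each study (H) occurs under exactly one formula (X), and then builds
one explicit label per (formula, study) pair, in the spirit of X:H.

```python
from collections import defaultdict

# The seven rows from the table above: (H, X) = (study, formula).
rows = [(1, 2), (1, 2), (2, 1), (2, 1), (2, 1), (3, 2), (4, 1)]


def is_nested(pairs):
    """Return True if every level of the first variable occurs with exactly
    one level of the second, i.e. the first is nested in the second."""
    seen = defaultdict(set)
    for inner, outer in pairs:
        seen[inner].add(outer)
    return all(len(outers) == 1 for outers in seen.values())


# Is each study tied to a single formula?
h_in_x = is_nested(rows)
# Is each formula tied to a single study?
x_in_h = is_nested([(x, h) for h, x in rows])

# One explicit label per (formula, study) pair that occurs in the data.
explicit = sorted({f"{x}:{h}" for h, x in rows})

print(h_in_x, x_in_h, explicit)
```

Here the H labels are already unique across formulas, so the X:H pairs have
exactly as many levels as H itself. Because lme4 expands (1|X/H) into
(1|X) + (1|X:H), for this table (1|X/H) and (1|X) + (1|H) should describe the
same grouping structure.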

On Mon, Jul 19, 2021 at 12:19 PM Thierry Onkelinx <thierry.onkelinx using inbo.be>
wrote:

> Dear Jack,
>
> IMHO the discussion of whether it is nested, partially nested, or crossed is
> pointless. Use explicit nesting by creating random effects with unique
> levels across the data. That is, each level defines a unique state for that
> variable, regardless of any other variables. So if you consider the formula
> of one study to be the same as the formula of another study, then they get
> the same level; otherwise they get a different level.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
> FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx using inbo.be
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be
>
>
> ///////////////////////////////////////////////////////////////////////////////////////////
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> ///////////////////////////////////////////////////////////////////////////////////////////
>
> <https://www.inbo.be>
>
>
> On Mon, Jul 19, 2021 at 16:32, Jack Solomon <kj.jsolomon using gmail.com> wrote:
>
>> Dear Thierry,
>>
>> Thank you for your interesting comment (H being nested in X). I read your
>> informative webpage as well which was in large part in line with this
>> comment: (https://stats.stackexchange.com/a/228814/140365).
>>
>> I think a little context can help. Think of H as a group of studies (each
>> with one or more rows). And think of X as scientific formulas each of which
>> a study has used (for all its rows) to measure the same construct.
>>
>> Given this context and the data below, do you think there is a "nesting"
>> or a "crossing" (full or partial) relationship between studies (H) and the
>> formulas (X) they used, and why?
>>
>> Thanks, Jack
>> H  X
>> 1   2
>> 1   2
>> 2   1
>> 2   1
>> 2   1
>> 3   2
>> 4   1
>>
>> On Mon, Jul 19, 2021 at 1:58 AM Thierry Onkelinx <
>> thierry.onkelinx using inbo.be> wrote:
>>
>>> Dear Jack,
>>>
>>> In your example H is implicitly nested in X. See
>>> https://www.muscardinus.be/2017/07/lme4-random-effects/ for
>>> more information on nested vs crossed effects.
>>>
>>> Best regards,
>>>
>>> ir. Thierry Onkelinx
>>>
>>> On Fri, Jul 16, 2021 at 01:09, Jack Solomon <kj.jsolomon using gmail.com> wrote:
>>>
>>>> Dear Ben,
>>>>
>>>> Just to make sure, the structure of my data is below. With this data
>>>> structure, I wonder why ~ (1|H) + (1|X) would indicate that H and X are
>>>> crossed random-effects?
>>>>
>>>> Because theoretically every value of X is capable of meeting every value
>>>> of H (or because each value of X means the same thing across any given
>>>> value of H)?
>>>>
>>>> Does this also mean each unique cluster (separately for H & X) is
>>>> considered correlated with another cluster?
>>>>
>>>> Thank you, Jack
>>>>
>>>> H  X
>>>> 1   2
>>>> 1   2
>>>> 2   1
>>>> 2   1
>>>> 2   1
>>>> 3   2
>>>> 4   1
>>>>
>>>> On Thu, Jul 15, 2021 at 8:46 AM Ben Bolker <bbolker using gmail.com> wrote:
>>>>
>>>> >
>>>> >
>>>> > On 7/15/21 9:44 AM, Jack Solomon wrote:
>>>> > > Dear Ben,
>>>> > >
>>>> > > In the case of #3 in your response, if the researcher intends to
>>>> > > generalize beyond the 3 levels of the categorical factor/predictor X,
>>>> > > then can s/he use: ~ (1|H) + (1|X)?
>>>> > >
>>>> > > If yes, then H and X will be crossed?
>>>> > >
>>>> > > Thanks,
>>>> > > Jack
>>>> >
>>>> >    Yes, and yes.
>>>> > >
>>>> > >
>>>> > > On Sat, Jul 10, 2021, 10:36 PM Jack Solomon <kj.jsolomon using gmail.com>
>>>> > > wrote:
>>>> > >
>>>> > >     Dear Ben,
>>>> > >
>>>> > >     Thank you for your informative response. I think #4 is what matches
>>>> > >     my situation.
>>>> > >
>>>> > >     Thanks again, Jack
>>>> > >
>>>> > >     On Sat, Jul 10, 2021 at 8:30 PM Ben Bolker <bbolker using gmail.com>
>>>> > >     wrote:
>>>> > >
>>>> > >             The "crossed vs. nested" terminology is only relevant in
>>>> > >         models with more than one grouping variable.  I would call (1|X)
>>>> > >         "a random effect of X" or, more precisely, "a random-intercept
>>>> > >         model with grouping variable X".
>>>> > >
>>>> > >             However, your question is a little unclear to me.  Is X a
>>>> > >         grouping variable or a predictor variable (numeric or
>>>> > >         categorical) that varies across groups?
>>>> > >
>>>> > >             I can think of four possibilities.
>>>> > >
>>>> > >            1. X is the grouping variable (e.g. "hospital"). Then ~ (1|X)
>>>> > >         is a model that describes variation in the model intercept /
>>>> > >         baseline value across hospitals.
>>>> > >
>>>> > >            2. X is a continuous covariate (e.g. annual hospital budget).
>>>> > >         Then if H is the factor designating hospitals, we want
>>>> > >         ~ X + (1|H) (plus any other fixed effects of interest). (It
>>>> > >         doesn't make sense / isn't identifiable to fit a random-slopes
>>>> > >         model ~ (X | H), because budgets don't vary within hospitals.)
>>>> > >
>>>> > >         3. X is a categorical / factor predictor (e.g. hospital size
>>>> > >         class {small, medium, large}, with multiple hospitals measured
>>>> > >         in each size class): ~ X + (1|H) (the same as #2).
>>>> > >
>>>> > >         4. X is a categorical predictor with unique values for each
>>>> > >         hospital (e.g. postal code).  Then X is redundant with H; you
>>>> > >         shouldn't try to include them both in the same model.
>>>> > >
>>>> > >         On 7/10/21 4:55 PM, Jack Solomon wrote:
>>>> > >          > Hello All,
>>>> > >          >
>>>> > >          > In my two-level data structure, I have a cluster-level
>>>> > >          > variable (called "X"; one that doesn't vary in any cluster).
>>>> > >          > If I intend to generalize beyond X's current possible levels,
>>>> > >          > then I should take X as a random effect.
>>>> > >          >
>>>> > >          > However, because "X" doesn't vary in any cluster, such a
>>>> > >          > random effect must necessarily be a crossed random effect
>>>> > >          > (e.g., "~ 1 | X"), correct?
>>>> > >          >
>>>> > >          > If yes, then what is "X" crossed with?
>>>> > >          >
>>>> > >          > Thank you,
>>>> > >          > Jack
>>>> > >          >
>>>> > >          >       [[alternative HTML version deleted]]
>>>> > >          >
>>>> > >          > _______________________________________________
>>>> > >          > R-sig-mixed-models using r-project.org mailing list
>>>> > >          > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>> > >          >
>>>> > >
>>>> > >         --
>>>> > >         Dr. Benjamin Bolker
>>>> > >         Professor, Mathematics & Statistics and Biology, McMaster University
>>>> > >         Director, School of Computational Science and Engineering
>>>> > >         Graduate chair, Mathematics & Statistics
>>>> > >
>>>> > >
>>>> >
>>>> >
>>>>
>>>>
>>>
