[R-sig-ME] Cluster-robust SEs & random effects -- seeking some clarification

Sun Aug 14 23:36:36 CEST 2022

Thank you for these thoughts. One thing that comes to mind in mixed models
for me is that not all mixed models incorporate both random intercepts and
slopes, so there may be a need to account for additional heteroskedasticity
in the case of random-intercepts only models?

There is also the consideration one has to make with predictors that are
factors vs. continuous (where one aggregates over the factor levels to
arrive at an estimate--I think this relates to the absorb command in STATA
whether in fixed or random models).

This paper by Cameron & Miller (which includes STATA relevant code) I also
found valuable, but have had to reread several times to fully appreciate
the complexity of this discourse:
http://jhr.uwpress.org/content/50/2/317.short?casa_token=Bo1aDvwZErEAAAAA:0sIhoGcywMWMXC1mnhqwW8cLNkoY-R8aOXqcMiFGISWYNaRP2nYIhzCbbRJ21GtKdRXa46hK4g

Best regards,
J.D.

On Sun, Aug 14, 2022 at 7:56 AM Fernando Pedro Bruna Quintas <f.bruna using udc.es>
wrote:

> Hi,
>
> I have followed only partially this thread. I also fail to understand the
> point of using clustered errors with mixed models People using Stata often
> calculate those cluster standard errors, at least in my disciplines in
> social sciences, using the "mixed" function. In R that was not possible
> until recently, as far as I know. For instance, tab_model() function from
> sjPlot does not allow calculation of clustered standard errors for models
> estimated with lme4 package. The recent package WeMix allows for that
> calculation, but only because it replicates what Stata mixed function does,
> but the authors do not justify why that calculation may be useful.
>
> The calculation of clustered standard errors is a way of capturing the
> effects of a smaller effective sample size when there are correlations by
> groups. However, that corrections is one of the main concerns of mixed
> modelling, as far as I understand.
>
> Abadie et al. 2022 do not recommend applying clustered standard error by
> default in standard regressions, and I understand that there are still less
> reasons to do it in mixed models.
>
> https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjVi57ioMb5AhVFnf0HHUAkCwkQFnoECAMQAQ&url=https%3A%2F%2Feconomics.mit.edu%2Ffiles%2F13927&usg=AOvVaw0vtAkWz0AFlwAPOEKYQgNq
>
> I appreciate more comments regarding this topic.
>
> Thank you
>
> Fernando Bruna
> Department of Economics
> Universidade da Corunna, Spain
>
> ------------------------------
> *De:* R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> en
> nombre de J.D. Haltigan <jhaltiga using gmail.com>
> *Enviado:* domingo, 14 de agosto de 2022 1:32
> *Para:* Ben Bolker <bbolker using gmail.com>
> *Cc:* r-sig-mixed-models using r-project.org <r-sig-mixed-models using r-project.org>
> *Asunto:* Re: [R-sig-ME] Cluster-robust SEs & random effects -- seeking
> some clarification
>
> One further post perhaps framing my question slightly differently (or
> altogether more generally):
>
> What, specifically, do cluster-robust/robust SEs allow one to do with more
> accuracy/precision *if* they are already using both random effects and
> slopes to model relevant cluster-specific effects. Is it the case that
> there may be any number of sources that could potentially account for
> sources of heteroskedasticity (i.e., autoregressive structure in the case
> of repeated measurements/time variables) that using the cluster robust SEs
> would be of value for in making more precise inference assuming some
> misspecification of the random effects structure of the model?
>
> Relatedly, is there a 'seminal' or 'key' paper that provides a deep dive on
> the concept of heteroskedasticity? I have a few on hand, but wanted to see
> if there was something I might not be aware of .
>
> On Wed, Aug 3, 2022 at 11:46 PM J.D. Haltigan <jhaltiga using gmail.com> wrote:
>
> > Thought I would bump my last post to see if anyone might weigh in on my
> > more general statements about re: unions (clusters). Would be most
> > appreciated as I continue to wrap my head around some things.
> >
> > Thanks in advance for any thoughts. I appreciate your time.
> >
> > On Sat, Jul 30, 2022 at 8:43 PM J.D. Haltigan <jhaltiga using gmail.com>
> wrote:
> >
> >> And to take one more step: the inference is that the single
> >> population-level treatment effect is drawing from a 'random sample' of
> >> unions (clusters). That is, one can not assume that union's are the
> same.
> >> They are not interchangeable. They are drawn from a random population.
> This
> >> is the point of the exercise for me. If we remove the random effect for
> >> union in the model I shared, we end up with a model in which only the
> fixed
> >> effects of pairID (treatment-control pairs for each pair of treatment
> >> control villages) are estimated (albeit using cluster-robust SEs). So,
> if
> >> by adding that random effect for union the treatment intervention is no
> >> longer significant (as opposed to a model in which there is no random
> >> effect of union modeled), what is that telling us? That some of the
> between
> >> cluster (union) variance in intercepts is contributing to variation in
> the
> >> response variable, yes?
> >>
> >> I realize you have not read the paper, nor are necessarily interested in
> >> this discourse, but any remarks are greatly appreciated.
> >>
> >>
> >> On Sat, Jul 30, 2022 at 8:36 PM Ben Bolker <bbolker using gmail.com> wrote:
> >>
> >>>    Yes.
> >>>
> >>> On 2022-07-30 8:12 p.m., J.D. Haltigan wrote:
> >>> > Thanks, Ben. So in the model you remarked on, would that be a
> >>> > 'random-intercepts only' model?
> >>> >
> >>> >
> >>> > On Sat, Jul 30, 2022 at 7:53 PM Ben Bolker <bbolker using gmail.com
> >>> > <mailto:bbolker using gmail.com <bbolker using gmail.com>>> wrote:
> >>> >
> >>> >     I haven't been following the whole thread that carefully, but I
> >>> want to
> >>> >     emphasize that
> >>> >
> >>> >         posXsymp~treatment + pairID + (1 | union)
> >>> >
> >>> >     is *not*, by any definition I'm familiar with, a "random-slopes
> >>> model";
> >>> >     that is, it only estimates a single population-level treatment
> >>> >     effect/doesn't allow the effect of treatment to vary across
> groups
> >>> >     defined by 'union'.  You would need a random-effect term of the
> >>> form
> >>> >     (treatment | union).
> >>> >
> >>> >         Reasons why you might *not* want to do this:
> >>> >
> >>> >        * if treatment only varies across and not within levels of
> union
> >>> >     ("union is nested within treatment" according to some
> terminology),
> >>> >     then
> >>> >     this variation is unidentifiable
> >>> >        * maybe you have decided that you don't have enough data/want
> a
> >>> more
> >>> >     parsimonious model.
> >>> >
> >>> >         Schielzeth and Forstmeier, among many others (this is the
> >>> example I
> >>> >     know of), have cautioned about the consequences of leaving out
> >>> >     random-slopes terms.
> >>> >
> >>> >     Schielzeth, Holger, and Wolfgang Forstmeier. “Conclusions beyond
> >>> >     Support: Overconfident Estimates in Mixed Models.” Behavioral
> >>> Ecology
> >>> >     20, no. 2 (March 1, 2009): 416–20.
> >>> >
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdoi.org%2F10.1093%2Fbeheco%2Farn145&data=05%7C01%7Cf.bruna%40udc.es%7Ccff181a16d084542b64e08da7d843c79%7Ccea1ea3e60b24f75a6c2a6022e8f961b%7C0%7C0%7C637960304201466714%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=HIiZ5D1zlFIyoL9ZHv%2Fv%2F3sJW6ecsXtCU3wduKtZRS4%3D&reserved=0
> >>> >     <
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdoi.org%2F10.1093%2Fbeheco%2Farn145&data=05%7C01%7Cf.bruna%40udc.es%7Ccff181a16d084542b64e08da7d843c79%7Ccea1ea3e60b24f75a6c2a6022e8f961b%7C0%7C0%7C637960304201466714%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=HIiZ5D1zlFIyoL9ZHv%2Fv%2F3sJW6ecsXtCU3wduKtZRS4%3D&reserved=0
> >.
> >>> >
> >>> >
> >>> >     On 2022-07-30 7:43 p.m., J.D. Haltigan wrote:
> >>> >      > Addendum:
> >>> >      >
> >>> >      > It just occurred to me on my walk that I think I am getting a
> >>> bit
> >>> >     lost in
> >>> >      > some of the differences in nomenclature across scientific
> silos.
> >>> >     In the
> >>> >      > original model that they specified, which treated the 'pairID'
> >>> >     variable as
> >>> >      > a control variable for which they controlled for 'fixed
> >>> effects' of
> >>> >      > control/treatment villages (in their own language in the
> paper)
> >>> using
> >>> >      > cluster-robust SEs, I think this is indeed a
> 'random-intercepts
> >>> >     only' model
> >>> >      > in the language of Hamaker et al. They implement the 'absorb'
> >>> >     command in
> >>> >      > STATA which I believe aggregates across the pairIDs to
> generate
> >>> an
> >>> >      > 'omnibus' F-test of sorts for the pairID variable (in the
> ANOVA
> >>> >      > nomenclature). I say this as when I specify the pairID
> variable
> >>> >     in the lmer
> >>> >      > model I shared (or in a fixest model I conducted to replicate
> >>> the
> >>> >     original
> >>> >      > Abalauck results in R), I get the estimates for all the pairs
> >>> >     (i.e., there
> >>> >      > is no way to aggregate across them--though I think formally
> the
> >>> >     models are
> >>> >      > the same if we are unconcerned about any one pairID
> >>> >     [treatment/control
> >>> >      > village pair].
> >>> >      >
> >>> >      > So, in the lmer model I shared where I specify a specific
> random
> >>> >     effects
> >>> >      > term for the 'cluster' variable, I think this indeed is
> allowing
> >>> >     for random
> >>> >      > slopes across the clusters which implies the treatment effect
> >>> may
> >>> >     vary
> >>> >      > across the clusters (and we might anticipate it will for
> various
> >>> >     reasons I
> >>> >      > can elaborate on). More generally: we are generalizing to
> *any*
> >>> >     universe of
> >>> >      > villages (say in the entire world) where the treatment
> >>> >     intervention (masks)
> >>> >      > may vary across villages. This is the crux of invoking the
> >>> random
> >>> >     effects
> >>> >      > model (i.e., random slopes model).
> >>> >      >
> >>> >      > I realize this is a mouthful, but I think the way these terms
> >>> (e.g.,
> >>> >      > random/fixed effects models etc.) are used across disciplines
> >>> >     makes things
> >>> >      > a bit confusing.
> >>> >      >
> >>> >      > On Sat, Jul 30, 2022 at 5:25 PM J.D. Haltigan <
> >>> jhaltiga using gmail.com
> >>> >     <mailto:jhaltiga using gmail.com <jhaltiga using gmail.com>>> wrote:
> >>> >      >
> >>> >      >> This is a very helpful walkthrough, James. My responses are
> >>> >     italicized
> >>> >      >> under yours to maintain thread readability. The key is
> >>> >     Generalizability
> >>> >      >> here and (as I also note in my last reply) the idea is to
> >>> >     Generalize to a
> >>> >      >> universe of "any villages or clusters." That is, the target
> >>> >     population we
> >>> >      >> are generalizing to is *any* random population.
> >>> >      >>
> >>> >      >> On Sat, Jul 30, 2022 at 3:01 PM James Pustejovsky
> >>> >     <jepusto using gmail.com <mailto:jepusto using gmail.com <jepusto using gmail.com>
> >>
> >>> >      >> wrote:
> >>> >      >>
> >>> >      >>> Hi J.D.,
> >>> >      >>> A few comments/reactions inline below.
> >>> >      >>> James
> >>> >      >>>
> >>> >      >>> On Wed, Jul 27, 2022 at 5:37 PM J.D. Haltigan
> >>> >     <jhaltiga using gmail.com <mailto:jhaltiga using gmail.com
> <jhaltiga using gmail.com>>> wrote:
> >>> >      >>>
> >>> >      >>>> ...
> >>> >      >>>>
> >>> >      >>> In the original investigation, the authors did not invoke a
> >>> random
> >>> >      >>>> effects model (but did use the pairIDs to control for fixed
> >>> >     effects as
> >>> >      >>>> noted and with robust SEs). Thus, in the original
> >>> >     investigation there was
> >>> >      >>>> *no* specification of a random effects model for the
> >>> 'cluster'
> >>> >     variable. We
> >>> >      >>>> know from some other work there were some biases in village
> >>> >     mapping and
> >>> >      >>>> other possible sources of between-cluster variation that
> >>> might be
> >>> >      >>>> anticipated to have influence--at the random intercepts
> >>> >     level--so we are
> >>> >      >>>> looking into how specifying 'cluster' as a random effect
> >>> might
> >>> >     change the
> >>> >      >>>> fixed effects estimates for the treatment intervention
> >>> effect.
> >>> >     In the
> >>> >      >>>> Hamaker et al. language, it is indeed a 'random intercepts'
> >>> >     only model.
> >>> >      >>>>
> >>> >      >>>
> >>> >      >>> I don't follow how using a random intercepts model improves
> >>> the
> >>> >      >>> generalizability warrant here. The random intercepts model
> is
> >>> >     essentially
> >>> >      >>> just a re-weighted average of the pair-specific effects in
> the
> >>> >     original
> >>> >      >>> analysis, where the weights are optimally efficient if the
> >>> model is
> >>> >      >>> correctly specified. That last clause carries a lot of
> weight
> >>> >     here--correct
> >>> >      >>> specification means 1) treatment assignment is unrelated to
> >>> the
> >>> >     random
> >>> >      >>> effects, 2) the treatment effect is constant across
> clusters,
> >>> 3)
> >>> >      >>> distributional assumptions are valid (i.e., homoskedasticity
> >>> at
> >>> >     each level
> >>> >      >>> of the model).
> >>> >      >>>
> >>> >      >>> If the effects are heterogeneous, then I would think that
> >>> including
> >>> >      >>> random slopes on the treatment indicator would provide a
> >>> better
> >>> >     basis for
> >>> >      >>> generalization. But even then, the warrant is still pretty
> >>> >     vague---what is
> >>> >      >>> the hypothetical population of villages from which the
> >>> observed
> >>> >     villages
> >>> >      >>> are sampled?
> >>> >      >>>
> >>> >      >>
> >>> >      >> *In the most basic model (without baseline controls) the
> model
> >>> >     takes the
> >>> >      >> form: myModel = lmer(posXsymp~treatment + pairID + (1 |
> union),
> >>> >     data =
> >>> >      >> myData). I believe--correct me if I am wrong--that this
> >>> reflects a
> >>> >      >> random-intercepts only model, but I may be mistaken. If I am,
> >>> >     and this is
> >>> >      >> allowing for random slopes on the treatment indicator, then I
> >>> >     will need to
> >>> >      >> rethink my statements.  *
> >>> >      >>
> >>> >      >>>
> >>> >      >>>
> >>> >      >>>> Given this, however, does it also make sense to include the
> >>> >     cluster
> >>> >      >>>> robust SEs for the fixed effects which would account for
> >>> possible
> >>> >      >>>> heterogeneity of treatment effects (i.e., slopes) across
> >>> >     clusters?s
> >>> >      >>>>
> >>> >      >>>> If you're committed to the random intercepts model, then
> yes
> >>> I
> >>> >     think so
> >>> >      >>> because using cluster robust SEs at least acknowledges the
> >>> >     possibility of
> >>> >      >>> heterogeneous treatment effects.
> >>> >      >>>
> >>> >      >>
> >>> >      >> *If the above model does allow for both random intercepts and
> >>> >     slopes, then
> >>> >      >> perhaps the use of cluster robust SEs is redundant in some
> >>> sense
> >>> >     since the
> >>> >      >> random slopes would be modeling the heterogeneity in
> treatment
> >>> >     effects?*
> >>> >      >>
> >>> >      >>>
> >>> >      >>>
> >>> >      >>>
> >>> >      >>>> Bottom line: in their original analyses, clusters are seen
> as
> >>> >      >>>> interchangeable from a conceptual perspective (rather than
> >>> >     drawn from a
> >>> >      >>>> random universe of observations). When one scales up
> evidence
> >>> >     to a universe
> >>> >      >>>> of observations that are random (as they would be in the
> >>> >     intended universe
> >>> >      >>>> of inference in the real-world), then we are better
> >>> >     positioned, I think, to
> >>> >      >>>> adjudicate whether the mask intervention effect is
> >>> 'practically
> >>> >      >>>> significant' (in addition to whether the focal effect
> remains
> >>> >     marginally
> >>> >      >>>> significant from a frequentist perspective).
> >>> >      >>>>
> >>> >      >>> As noted above, this argument is a bit vague to me. If
> there's
> >>> >     concern
> >>> >      >>> about generalizability, then my first question would be:
> what
> >>> >     is the target
> >>> >      >>> population to which you are trying to generalize?
> >>> >      >>>
> >>> >      >>
> >>> >      >> *Essentially, the target population we are trying to
> generalize
> >>> >     to is a
> >>> >      >> random selection of villages. Any random selection of
> villages.
> >>> >     In other
> >>> >      >> words, villages should not be seen as interchangeable. We are
> >>> >     interested in
> >>> >      >> whether the effects generalize to any randomly selected
> >>> village. *
> >>> >      >>
> >>> >      >>>
> >>> >      >>>
> >>> >      >>
> >>> >      >
> >>> >      >       [[alternative HTML version deleted]]
> >>> >      >
> >>> >      > _______________________________________________
> >>> >      > R-sig-mixed-models using r-project.org
> >>> >     <mailto:R-sig-mixed-models using r-project.org
> <R-sig-mixed-models using r-project.org>> mailing list
> >>> >      >
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=05%7C01%7Cf.bruna%40udc.es%7Ccff181a16d084542b64e08da7d843c79%7Ccea1ea3e60b24f75a6c2a6022e8f961b%7C0%7C0%7C637960304201466714%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=YM5skHwDtAoKz1KVXrCB1bjBRLvTE8UMuZjaE964iBY%3D&reserved=0
> >>> >     <
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=05%7C01%7Cf.bruna%40udc.es%7Ccff181a16d084542b64e08da7d843c79%7Ccea1ea3e60b24f75a6c2a6022e8f961b%7C0%7C0%7C637960304201466714%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=YM5skHwDtAoKz1KVXrCB1bjBRLvTE8UMuZjaE964iBY%3D&reserved=0
> >
> >>> >
> >>> >     --
> >>> >     Dr. Benjamin Bolker
> >>> >     Professor, Mathematics & Statistics and Biology, McMaster
> >>> University
> >>> >     Director, School of Computational Science and Engineering
> >>> >     (Acting) Graduate chair, Mathematics & Statistics
> >>> >
> >>> >     _______________________________________________
> >>> >     R-sig-mixed-models using r-project.org
> >>> >     <mailto:R-sig-mixed-models using r-project.org
> <R-sig-mixed-models using r-project.org>> mailing list
> >>> >
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=05%7C01%7Cf.bruna%40udc.es%7Ccff181a16d084542b64e08da7d843c79%7Ccea1ea3e60b24f75a6c2a6022e8f961b%7C0%7C0%7C637960304201466714%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=YM5skHwDtAoKz1KVXrCB1bjBRLvTE8UMuZjaE964iBY%3D&reserved=0
> >>> >     <
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=05%7C01%7Cf.bruna%40udc.es%7Ccff181a16d084542b64e08da7d843c79%7Ccea1ea3e60b24f75a6c2a6022e8f961b%7C0%7C0%7C637960304201466714%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=YM5skHwDtAoKz1KVXrCB1bjBRLvTE8UMuZjaE964iBY%3D&reserved=0
> >
> >>> >
> >>>
> >>> --
> >>> Dr. Benjamin Bolker
> >>> Professor, Mathematics & Statistics and Biology, McMaster University
> >>> Director, School of Computational Science and Engineering
> >>> (Acting) Graduate chair, Mathematics & Statistics
> >>>
> >>
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
>
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-mixed-models&data=05%7C01%7Cf.bruna%40udc.es%7Ccff181a16d084542b64e08da7d843c79%7Ccea1ea3e60b24f75a6c2a6022e8f961b%7C0%7C0%7C637960304201466714%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=YM5skHwDtAoKz1KVXrCB1bjBRLvTE8UMuZjaE964iBY%3D&reserved=0
>

	[[alternative HTML version deleted]]