[R-meta] Studies with more than one control group

Thu Jul 22 00:46:01 CEST 2021

Thank you very much for your confirmation and for linking to that fantastic
exchange.

Best wishes,
Jack

On Wed, Jul 21, 2021 at 2:46 PM James Pustejovsky <jepusto using gmail.com> wrote:

> Okay thanks for clarifying.
>
> In this case, then using random effects per control group implies that
> we're aiming to generalize to a population of studies, each of which could
> involve one or possibly multiple control groups. Formally, the overall
> average effect size parameter represents the mean of a set of
> study-specific average effect size parameters, and those study-specific
> average effect size parameters represent means from a distribution of
> effects across a hypothetical set of possible control groups.
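
As a rough sketch of the structure described in the paragraph above, the model could be written as follows. The notation is an assumption introduced here for illustration (it does not appear in the thread): y_{ijk} is the k-th effect size estimate for control group j in study i, e_{ijk} is its sampling error, and the normal distributions are the usual working assumption rather than something stated in the thread.

```latex
\begin{aligned}
y_{ijk}     &= \theta_{ij} + e_{ijk}, \\
\theta_{ij} &= \theta_i + v_{ij}, & v_{ij} &\sim N(0, \omega^2), \\
\theta_i    &= \mu + u_i,         & u_i    &\sim N(0, \tau^2).
\end{aligned}
```

Here \mu is the overall average effect, \theta_i is the study-specific average effect (the mean over a hypothetical distribution of control groups for study i), and \theta_{ij} is the effect for one particular control group in that study.
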
>
> My interpretation here is related to another recent exchange on the
> listserv about random effects and sampling:
> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2021-July/002994.html
>
>
> On Wed, Jul 21, 2021 at 10:56 AM Jack Solomon <kj.jsolomon using gmail.com>
> wrote:
>
>> Oh sorry, yes they are simply labels (I thought all ID variables are
>> labels). In our case, none of the control groups are waitlist groups.
>> Basically, due to the criticisms in the literature regarding the definition
>> of the control groups, some newer studies have included two "active"
>> control groups (one doing X, the other doing Y) and then they benchmark
>> their treated groups against both these control groups to show that doing X
>> vs. Y has no bearing on the final result.
>>
>> I hope my clarification helps,
>> Thanks again,
>> Jack
>>
>> On Wed, Jul 21, 2021 at 10:43 AM James Pustejovsky <jepusto using gmail.com>
>> wrote:
>>
>>> I am still wondering whether control 1 versus control 2 has a specific
>>> meaning. For example, perhaps controlID = 1 means that the study used a
>>> wait-list control group, whereas controlID = 2 means that the study used an
>>> attentional control group. Is this the case? Or is controlID just an
>>> arbitrary set of labels, where you could have replaced the numerical values
>>> as follows without losing any information?
>>>
>>> studyID  yi  controlID
>>> 1        .1  A
>>> 1        .2  B
>>> 1        .3  A
>>> 1        .4  B
>>> 2        .5  C
>>> 2        .6  D
>>> 3        .7  E
>>>
>>> On Wed, Jul 21, 2021 at 10:29 AM Jack Solomon <kj.jsolomon using gmail.com>
>>> wrote:
>>>
>>>> Dear James,
>>>>
>>>> Thank you for your reply. "controlID" distinguishes between effect
>>>> sizes (SMDs in this case) that have been obtained by comparing the treated
>>>> groups to control 1 vs. control 2 (see below).
>>>>
>>>> I was wondering if adding such an ID variable (just like schoolID) and
>>>> the random effect associated with it would also mean that we are
>>>> generalizing beyond the levels of controlID, which would in turn mean that
>>>> we anticipate that each study 'could' have any number of control groups,
>>>> not just a maximum of 2?
>>>>
>>>> Thanks again, Jack
>>>>
>>>> studyID  yi  controlID
>>>> 1        .1  1
>>>> 1        .2  2
>>>> 1        .3  1
>>>> 1        .4  2
>>>> 2        .5  1
>>>> 2        .6  2
>>>> 3        .7  1
>>>>
>>>> On Wed, Jul 21, 2021 at 10:13 AM James Pustejovsky <jepusto using gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Jack,
>>>>>
>>>>> To make sure I follow the structure of your data, let me ask: Do
>>>>> controlID = 1 or controlID = 2 correspond to specific *types* of control
>>>>> groups that have the same meaning across all of your studies? Or is this
>>>>> just an arbitrary ID variable?
>>>>>
>>>>> In my earlier response, I was assuming that controlID in your data is
>>>>> just an ID variable. Using random effects specified as
>>>>> ~ 1 | studyID/controlID
>>>>> means that you're including random *intercept* terms for each unique
>>>>> control group nested within studyID. It has nothing to do with the number
>>>>> of control groups.
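
A minimal sketch of what nesting controlID within studyID amounts to, assuming the seven example rows posted in this thread. The helper names `nested_groups` and `relabel` are hypothetical and only for illustration; this does not fit any model, it only shows which rows would share a random intercept.

```python
# Rows copied from the example data in this thread: (studyID, yi, controlID).
rows = [
    (1, 0.1, 1), (1, 0.2, 2), (1, 0.3, 1), (1, 0.4, 2),
    (2, 0.5, 1), (2, 0.6, 2),
    (3, 0.7, 1),
]


def nested_groups(rows):
    """Return the nested key (studyID, controlID) for each row.

    Nesting means a control group is identified by the pair, so
    control 1 in study 1 and control 1 in study 2 are different groups.
    """
    return [(study, control) for study, _yi, control in rows]


def relabel(keys):
    """Give each distinct key an arbitrary letter, in order of first appearance."""
    labels = {}
    for key in keys:
        if key not in labels:
            labels[key] = chr(ord("A") + len(labels))
    return [labels[key] for key in keys]


keys = nested_groups(rows)
letters = relabel(keys)

# Number of distinct control groups, i.e. distinct random intercepts at this level.
n_control_groups = len(set(keys))

# How many control groups each study contributes.
controls_per_study = {}
for study, _control in set(keys):
    controls_per_study[study] = controls_per_study.get(study, 0) + 1

print(letters)
print(n_control_groups, controls_per_study)
```

The relabeled column carries the same grouping information as the original 1/2 coding combined with studyID, which is why the numeric values of controlID say nothing about how many control groups a study "could" have.
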
>>>>>
>>>>> James
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 19, 2021 at 10:56 PM Jack Solomon <kj.jsolomon using gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear James,
>>>>>>
>>>>>> I'm coming back to this after a while (preparing the data). A quick
>>>>>> follow-up. So, you mentioned that if I have several studies that have used
>>>>>> more than 1 control group (in my data up to 2), I can possibly add a
>>>>>> random-effect (controlID) to capture any heterogeneity in the effect sizes
>>>>>> across control groups nested within studies.
>>>>>>
>>>>>> My question is whether adding a controlID random effect (a binary
>>>>>> indicator: 1 or 2) would also mean that we intend to generalize beyond the
>>>>>> possible number of control groups that a study can employ (in my data,
>>>>>> beyond 2 control groups).
>>>>>>
>>>>>> Thank you,
>>>>>> Jack
>>>>>>
>>>>>> On Thu, Jun 24, 2021 at 4:52 PM Jack Solomon <kj.jsolomon using gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you very much for the clarification. That makes perfect sense.
>>>>>>>
>>>>>>> Jack
>>>>>>>
>>>>>>> On Thu, Jun 24, 2021 at 4:44 PM James Pustejovsky <jepusto using gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> The random effect for controlID is capturing any heterogeneity in
>>>>>>>> the effect sizes across control groups nested within studies, *above and
>>>>>>>> beyond heterogeneity explained by covariates.* Thus, if you include a
>>>>>>>> covariate to distinguish among types of control groups, and the differences
>>>>>>>> between types of control groups are consistent across studies, then the
>>>>>>>> covariate might explain all (or nearly all) of the variation at that level,
>>>>>>>> which would obviate the purpose of including the random effect at that
>>>>>>>> level.
>>>>>>>>
>>>>>>>> On Thu, Jun 24, 2021 at 9:56 AM Jack Solomon <kj.jsolomon using gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thank you James. On my question 3, I was implicitly referring to
>>>>>>>>> my previous question (a previous post titled: Studies with independent
>>>>>>>>> samples) regarding the fact that if I decide to drop 'sampleID', then I
>>>>>>>>> need to change the coding of the 'studyID' column (i.e., then, each sample
>>>>>>>>> should be coded as an independent study). So, in my question 3, I was
>>>>>>>>> really asking whether, in the case of 'controlID', removing it would
>>>>>>>>> leave the coding of all other columns in my data unchanged.
>>>>>>>>>
>>>>>>>>> Regarding adding 'controlID' as a random effect, you said: "... an
>>>>>>>>> additional random effect for controlID will depend on how many studies
>>>>>>>>> include multiple control groups and whether the model includes a covariate
>>>>>>>>> to distinguish among types of control groups (e.g., business-as-usual
>>>>>>>>> versus waitlist versus active control group)."
>>>>>>>>>
>>>>>>>>> I understand that the number of studies with multiple control
>>>>>>>>> groups matters for deciding whether to add a random effect. But why is
>>>>>>>>> having "a covariate to distinguish among types of control groups"
>>>>>>>>> important for that decision?
>>>>>>>>>
>>>>>>>>> Thanks, Jack
>>>>>>>>>
>>>>>>>>> On Thu, Jun 24, 2021 at 9:17 AM James Pustejovsky <jepusto using gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Jack,
>>>>>>>>>>
>>>>>>>>>> Responses inline below.
>>>>>>>>>>
>>>>>>>>>> James
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I have come across a couple of primary studies in my meta-analytic pool
>>>>>>>>>>> that have used two comparison/control groups (as the definition of
>>>>>>>>>>> 'control' has been debated in the literature I'm meta-analyzing).
>>>>>>>>>>>
>>>>>>>>>>> (1) Given that, should I create an additional column ('control') to
>>>>>>>>>>> distinguish between effect sizes (SMDs in this case) that have been
>>>>>>>>>>> obtained by comparing the treated groups to control 1 vs. control 2
>>>>>>>>>>> (see below)?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Yes. Along the same lines as my response to your earlier
>>>>>>>>>> question, it seems prudent to include ID variables like this in order to
>>>>>>>>>> describe the structure of the included studies.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> (2) If yes, then, does the addition of a 'control' column call for the
>>>>>>>>>>> addition of a random effect for 'control' of the form
>>>>>>>>>>> "~ 1 | studyID/controlID" (to be empirically tested)?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> I expect you will find differences of opinion here.
>>>>>>>>>> Pragmatically, the feasibility of estimating a model with an additional
>>>>>>>>>> random effect for controlID will depend on how many studies include
>>>>>>>>>> multiple control groups and whether the model includes a covariate to
>>>>>>>>>> distinguish among types of control groups (e.g., business-as-usual versus
>>>>>>>>>> waitlist versus active control group).
>>>>>>>>>>
>>>>>>>>>> At a conceptual level, omitting random effects for controlID
>>>>>>>>>> leads to essentially the same results as averaging the ES across both
>>>>>>>>>> control groups. If averaging like this makes conceptual sense, then
>>>>>>>>>> omitting the random effects might be reasonable.
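
To make the averaging idea concrete, here is a small sketch under loudly stated assumptions: the rows are hypothetical, each study reports one effect size per control group, and the average is a simple unweighted mean. A real analysis would also need the sampling variances and the dependence created by the shared treated group; the function name `average_within_study` is made up for this illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows (studyID, yi, controlID); not real results.
rows = [
    (1, 0.2, 1), (1, 0.4, 2),
    (2, 0.5, 1), (2, 0.6, 2),
    (3, 0.7, 1),
]


def average_within_study(rows):
    """Collapse effect sizes across control groups: one unweighted mean per study."""
    by_study = defaultdict(list)
    for study, yi, _control in rows:
        by_study[study].append(yi)
    return {study: mean(values) for study, values in by_study.items()}


study_means = average_within_study(rows)
for study, value in sorted(study_means.items()):
    print(study, round(value, 3))
```

If collapsing the two control-group comparisons into one number per study like this seems conceptually defensible, that is an argument for leaving out the controlID random effect; if the two comparisons answer different questions, it is an argument for keeping the level.
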
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> (3) If I later decide to drop controlID from my dataset, I think I can
>>>>>>>>>>> still keep all effect sizes from both control groups intact without any
>>>>>>>>>>> changes to my coding scheme, right?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I don't understand what your concern is here. Why not just keep
>>>>>>>>>> controlID in your dataset as a descriptor, even if it doesn't get used in
>>>>>>>>>> the model?
>>>>>>>>>>
>>>>>>>>>
