[R-meta] Studies with more than one control group

James Pustejovsky jepusto at gmail.com
Wed Jul 21 21:46:12 CEST 2021


Okay thanks for clarifying.

In this case, using random effects per control group implies that
we're aiming to generalize to a population of studies, each of which could
involve one or possibly multiple control groups. Formally, the overall
average effect size parameter represents the mean of a set of
study-specific average effect size parameters, and those study-specific
average effect size parameters represent means from a distribution of
effects across a hypothetical set of possible control groups.
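
For concreteness, here is a minimal sketch of how such a model could be
specified with metafor (just an illustration on my part, assuming a data
frame dat with columns yi, vi, studyID, and controlID like the examples
quoted below):

library(metafor)

# control groups nested within studies; the intercept estimates the
# overall average effect size parameter described above
res <- rma.mv(yi, vi,
              random = ~ 1 | studyID/controlID,
              data = dat)
summary(res)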

My interpretation here is related to another recent exchange on the
listserv about random effects and sampling:
https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2021-July/002994.html
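
And as comes up further down in this thread, whether the extra variance
component for controlID is worth keeping can be checked empirically. A
rough sketch (again just an illustration; control_type is a hypothetical
covariate coding the kind of control group):

# same fixed effects, with and without the controlID random effect
res1 <- rma.mv(yi, vi, random = ~ 1 | studyID/controlID, data = dat)
res0 <- rma.mv(yi, vi, random = ~ 1 | studyID, data = dat)
anova(res1, res0)  # LRT for the control-group-level variance component

# a covariate distinguishing control-group types may absorb much of that
# variance (the point about covariates quoted below):
# res2 <- rma.mv(yi, vi, mods = ~ control_type,
#                random = ~ 1 | studyID/controlID, data = dat)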


On Wed, Jul 21, 2021 at 10:56 AM Jack Solomon <kj.jsolomon using gmail.com> wrote:

> Oh sorry, yes they are simply labels (I thought all ID variables are
> labels). In our case, none of the control groups are waitlist groups.
> Basically, due to the criticisms in the literature regarding the definition
> of the control groups, some newer studies have included two "active"
> control groups (one doing X, the other doing Y) and then they benchmark
> their treated groups against both these control groups to show that doing X
> vs. Y has no bearing on the final result.
>
> I hope my clarification helps,
> Thanks again,
> Jack
>
> On Wed, Jul 21, 2021 at 10:43 AM James Pustejovsky <jepusto using gmail.com>
> wrote:
>
>> I am still wondering whether control 1 versus control 2 has a specific
>> meaning. For example, perhaps controlID = 1 means that the study used a
>> wait-list control group, whereas controlID = 2 means that the study used an
>> attentional control group. Is this the case? Or is controlID just an
>> arbitrary set of labels, where you could have replaced the numerical values
>> as follows without losing any information?
>>
>> studyID  yi  controlID
>> 1        .1      A
>> 1        .2      B
>> 1        .3      A
>> 1        .4      B
>> 2        .5      C
>> 2        .6      D
>> 3        .7      E
>>
>> On Wed, Jul 21, 2021 at 10:29 AM Jack Solomon <kj.jsolomon using gmail.com>
>> wrote:
>>
>>> Dear James,
>>>
>>> Thank you for your reply. "controlID" distinguishes between effect sizes
>>> (SMDs in this case) that have been obtained by comparing the treated groups
>>> to control 1 vs. control 2 (see below).
>>>
>>> I was wondering if adding such an ID variable (just like schoolID) and
>>> the random effect associated with it would also mean that we are
>>> generalizing beyond the levels of controlID, which, in turn, would mean
>>> that we anticipate that each study 'could' have any number of control
>>> groups rather than being limited to a max of 2?
>>>
>>> Thanks again, Jack
>>>
>>> studyID  yi  controlID
>>> 1        .1      1
>>> 1        .2      2
>>> 1        .3      1
>>> 1        .4      2
>>> 2        .5      1
>>> 2        .6      2
>>> 3        .7      1
>>>
>>> On Wed, Jul 21, 2021 at 10:13 AM James Pustejovsky <jepusto using gmail.com>
>>> wrote:
>>>
>>>> Hi Jack,
>>>>
>>>> To make sure I follow the structure of your data, let me ask: Do
>>>> controlID = 1 or controlID = 2 correspond to specific *types* of control
>>>> groups that have the same meaning across all of your studies? Or is this
>>>> just an arbitrary ID variable?
>>>>
>>>> In my earlier response, I was assuming that controlID in your data is
>>>> just an ID variable. Using random effects specified as
>>>> ~ 1 | studyID/controlID
>>>> means that you're including random *intercept* terms for each unique
>>>> control group nested within studyID. It has nothing to do with the number
>>>> of control groups.
>>>>
>>>> James
>>>>
>>>>
>>>>
>>>> On Mon, Jul 19, 2021 at 10:56 PM Jack Solomon <kj.jsolomon using gmail.com>
>>>> wrote:
>>>>
>>>>> Dear James,
>>>>>
>>>>> I'm coming back to this after a while (preparing the data). A quick
>>>>> follow-up. So, you mentioned that if I have several studies that have used
>>>>> more than 1 control group (in my data up to 2), I can possibly add a
>>>>> random effect (controlID) to capture any heterogeneity in the effect sizes
>>>>> across control groups nested within studies.
>>>>>
>>>>> My question is: does adding a controlID random effect (a binary
>>>>> indicator: 1 or 2) also mean that we intend to generalize beyond the
>>>>> possible number of control groups that a study can employ (for my data,
>>>>> beyond 2 control groups)?
>>>>>
>>>>> Thank you,
>>>>> Jack
>>>>>
>>>>> On Thu, Jun 24, 2021 at 4:52 PM Jack Solomon <kj.jsolomon using gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you very much for the clarification. That makes perfect sense.
>>>>>>
>>>>>> Jack
>>>>>>
>>>>>> On Thu, Jun 24, 2021 at 4:44 PM James Pustejovsky <jepusto using gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> The random effect for controlID is capturing any heterogeneity in
>>>>>>> the effect sizes across control groups nested within studies, *above and
>>>>>>> beyond heterogeneity explained by covariates.* Thus, if you include a
>>>>>>> covariate to distinguish among types of control groups, and the differences
>>>>>>> between types of control groups are consistent across studies, then the
>>>>>>> covariate might explain all (or nearly all) of the variation at that level,
>>>>>>> which would obviate the purpose of including the random effect at that
>>>>>>> level.
>>>>>>>
>>>>>>> On Thu, Jun 24, 2021 at 9:56 AM Jack Solomon <kj.jsolomon using gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you James. On my question 3, I was implicitly referring to my
>>>>>>>> previous question (a previous post titled: Studies with independent
>>>>>>>> samples) regarding the fact that if I decide to drop 'sampleID', then I
>>>>>>>> need to change the coding of the 'studyID' column (i.e., each sample
>>>>>>>> would then be coded as an independent study). So, in my question 3, I
>>>>>>>> was really asking to confirm that, in the case of 'controlID', removing
>>>>>>>> it doesn't require changing the coding of any other columns in my data.
>>>>>>>>
>>>>>>>> Regarding adding 'controlID' as a random effect, you said: "... an
>>>>>>>> additional random effect for controlID will depend on how many studies
>>>>>>>> include multiple control groups and whether the model includes a covariate
>>>>>>>> to distinguish among types of control groups (e.g., business-as-usual
>>>>>>>> versus waitlist versus active control group)."
>>>>>>>>
>>>>>>>> I understand that the number of studies with multiple control
>>>>>>>> groups matters for whether to add a random effect or not. But why is
>>>>>>>> having "a covariate to distinguish among types of control groups"
>>>>>>>> important in deciding whether to add a random effect or not?
>>>>>>>>
>>>>>>>> Thanks, Jack
>>>>>>>>
>>>>>>>> On Thu, Jun 24, 2021 at 9:17 AM James Pustejovsky <
>>>>>>>> jepusto using gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Jack,
>>>>>>>>>
>>>>>>>>> Responses inline below.
>>>>>>>>>
>>>>>>>>> James
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I have come across a couple of primary studies in my
>>>>>>>>>> meta-analytic pool
>>>>>>>>>> that have used two comparison/control groups (as the definition of
>>>>>>>>>> 'control' has been debated in the literature I'm meta-analyzing).
>>>>>>>>>>
>>>>>>>>>> (1) Given that, should I create an additional column ('control')
>>>>>>>>>> to
>>>>>>>>>> distinguish between effect sizes (SMDs in this case) that have
>>>>>>>>>> been
>>>>>>>>>> obtained by comparing the treated groups to control 1 vs. control
>>>>>>>>>> 2 (see
>>>>>>>>>> below)?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Yes. Along the same lines as my response to your earlier question,
>>>>>>>>> it seems prudent to include ID variables like this in order to describe the
>>>>>>>>> structure of the included studies.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> (2) If yes, then, does the addition of a 'control' column call
>>>>>>>>>> for the
>>>>>>>>>> addition of a random effect for 'control' of the form:  "~ |
>>>>>>>>>> studyID/controlID" (to be empirically tested)?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I expect you will find differences of opinion here. Pragmatically,
>>>>>>>>> the feasibility of estimating a model with an additional random effect for
>>>>>>>>> controlID will depend on how many studies include multiple control groups
>>>>>>>>> and whether the model includes a covariate to distinguish among types of
>>>>>>>>> control groups (e.g., business-as-usual versus waitlist versus active
>>>>>>>>> control group).
>>>>>>>>>
>>>>>>>>> At a conceptual level, omitting random effects for controlID leads
>>>>>>>>> to essentially the same results as averaging the ES across both control
>>>>>>>>> groups. If averaging like this makes conceptual sense, then omitting the
>>>>>>>>> random effects might be reasonable.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> (3) If I later decide to drop controlID from my dataset, I think
>>>>>>>>>> I can
>>>>>>>>>> still keep all effect sizes from both control groups intact
>>>>>>>>>> without any
>>>>>>>>>> changes to my coding scheme, right?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I don't understand what your concern is here. Why not just keep
>>>>>>>>> controlID in your dataset as a descriptor, even if it doesn't get used in
>>>>>>>>> the model?
>>>>>>>>>
>>>>>>>>



