[R-meta] "Categorical" moderator varying within and between studies

Simon Harmel sim.harmel using gmail.com
Sun Jun 7 06:07:15 CEST 2020


Many thanks, James. A quick follow-up: the strategy you described is a
general regression modeling strategy, right? That is, even if we were
fitting a multi-level model, the fixed-effects part of the formula would
still have to include the same construction (i.e., *b1 (%
female-within)_ij + b2 (% female-between)_j*)?
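
To make the question concrete, here is a minimal base-R sketch of what I
mean, not a definitive implementation. The toy data and the column names
(pf, pf_wthn, pf_btw, esid) are made up for illustration, and the rma.mv()
call in the comments is only my assumption about how a multi-level fit
might look; the robu() arguments follow your example.

```r
# Toy data (made up): study 1 is "type A" (one pooled effect size),
# studies 2 and 3 are "type B" (separate male and female effect sizes).
dat <- data.frame(
  study = c(1, 2, 2, 3, 3),
  pf    = c(60, 0, 100, 0, 100),            # % female in each sample
  yi    = c(0.30, 0.20, 0.35, 0.10, 0.30),  # effect size estimates
  vi    = c(0.02, 0.04, 0.04, 0.05, 0.05)   # sampling variances
)

# Between-study part: the study-level mean of % female
dat$pf_btw <- ave(dat$pf, dat$study)

# Within-study part: deviation of each sample from its study mean
dat$pf_wthn <- dat$pf - dat$pf_btw

# The fixed-effects construction I am asking about:
#   yi ~ pf_wthn + pf_btw
#
# With robumeta this would presumably be
#   robu(yi ~ pf_wthn + pf_btw, data = dat,
#        studynum = study, var.eff.size = vi)
# and with a multi-level model in metafor (an assumption on my part),
#   rma.mv(yi, vi, mods = ~ pf_wthn + pf_btw,
#          random = ~ 1 | study / esid, data = dat)
# where esid would be a hypothetical effect-size identifier.
print(dat)
```

In both cases the same two columns would enter the fixed part; only the
random-effects (or working) structure would differ.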

Thanks,
Simon

On Thu, Jun 4, 2020 at 9:42 AM James Pustejovsky <jepusto using gmail.com> wrote:

> Hi Simon,
>
> Please keep the listserv cc'd so that others can benefit from these
> discussions.
>
> Unfortunately, I don't think there is any single answer to your
> question---analytic strategies just depend too much on what your research
> questions are and the substantive context that you're working in.
>
> But speaking generally, the advantages of splitting predictors into
> within- and between-study versions are two-fold. First is that doing this
> provides an understanding of the structure of the data you're working with,
> in that it forces one to consider *which* predictors have within-study
> variation and *how much* variation there is (e.g., perhaps many studies
> have looked at internalizing symptoms, many studies have looked at
> externalizing symptoms, but only a few have looked at both types of
> outcomes in the same sample). The second advantage is that within-study
> predictors have a distinct interpretation from between-study predictors,
> and the within-study version is often theoretically more
> interesting/salient. That's because comparisons of effect sizes based on
> within-study variation hold constant other aspects of the studies that
> could influence effect size (and that could muddy the interpretation of the
> moderator).
>
> Here is an example that comes up often in research synthesis projects.
> Suppose that you're interested in whether participant sex moderates the
> effect of some intervention. Most of the studies in the sample are of type
> A, such that only aggregated effect sizes can be calculated. For these type
> A studies, we are able to determine a) the average effect size across the
> full sample (pooling across sex) and b) the sex composition of the sample
> (e.g., % female). For a smaller number of studies of type B, we are able to
> obtain dis-aggregated results for subgroups of male and female
> participants. For these studies, we are able to determine a) the average
> effect size for males and b) the average effect size for females, plus c)
> the sex composition of each of the sub-samples (respectively 0% and 100%
> female).
>
> Without considering within/between variation in the predictor, a
> meta-regression testing for whether sex is a moderator is:
>
> Y_ij = b0 + b1 (% female)_ij + e_ij
>
> The coefficient b1 describes how effect size magnitude varies across
> samples that differ by 1% in the percent of females. But the estimate of
> this coefficient pools information across studies of type A and studies of
> type B, essentially assuming that the contextual effects (variance
> explained by sample composition) are the same as the individual-level
> moderator effects (how the intervention effect varies between males and
> females).
>
> Now, if we use the within/between decomposition, the meta-regression
> becomes:
>
> Y_ij = b0 + b1 (% female-within)_ij + b2 (% female-between)_j + e_ij
>
> In this model, b1 will be estimated *using only the studies of type B*,
> as an average of the moderator effects for the studies that provide
> dis-aggregated data. And b2 will be estimated using studies of type A and
> the study-level average % female in studies of type B. Thus b2 can be
> interpreted as a pure contextual effect (variance explained by sample
> composition). Why does this matter? It's because contextual effects usually
> have a much murkier interpretation than individual-level moderator effects.
> Maybe this particular intervention has been tested for several different
> professions (e.g., education, nursing, dentistry, construction), and
> professions that tend to have higher proportions of females are also those
> that tend to be lower-status. If there is a positive contextual effect for
> % female, then it might be that a) the intervention really is more
> effective for females than for males or b) the intervention is equally
> effective for males and females but tends to work better when used with
> lower-status professions. Looking at between/within study variance in the
> predictor lets us disentangle those possibilities, at least partially.
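
A quick check of this point in base R, as a sketch on made-up numbers. It
assumes plain unweighted least squares via lm() rather than robu(), and the
column names (pf, pf_wthn, pf_btw) are hypothetical. The exact identity
below relies on that assumption (or on weights that are constant within
each study).

```r
# Toy data (made up): studies 1-3 are type A (one pooled effect size each),
# studies 4-5 are type B (male and female effect sizes reported separately).
dat <- data.frame(
  study = c(1, 2, 3, 4, 4, 5, 5),
  pf    = c(40, 55, 70, 0, 100, 0, 100),                 # % female
  yi    = c(0.25, 0.30, 0.40, 0.10, 0.30, 0.20, 0.50)    # effect sizes
)

# Within/between decomposition of % female
dat$pf_btw  <- ave(dat$pf, dat$study)   # study-level mean
dat$pf_wthn <- dat$pf - dat$pf_btw      # deviation from the study mean

# Meta-regression with both components (unweighted, for illustration only)
fit_all <- lm(yi ~ pf_wthn + pf_btw, data = dat)

# The within-study predictor is zero in every type A row ...
type_b <- dat$pf_wthn != 0

# ... so b1 can be reproduced by hand from the type B rows alone
b1_type_b <- sum(dat$pf_wthn[type_b] * dat$yi[type_b]) /
  sum(dat$pf_wthn[type_b]^2)

print(coef(fit_all)["pf_wthn"])
print(b1_type_b)
```

Here b1 comes out as 0.0025 per percentage point, i.e., the average
female-minus-male difference in the two type B studies (0.25) divided by
100. The type A rows contribute nothing to it.
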
>
> James
>
> On Wed, Jun 3, 2020 at 9:27 AM Simon Harmel <sim.harmel using gmail.com> wrote:
>
>> Indeed that was the problem, Gerta. Thanks.
>>
>> But James, in meta-analysis it is very common to have multiple
>> categorical variables, each with several levels, that vary both within
>> and between studies.
>>
>> So, if we need to do this for each level of each such categorical
>> variable, it would certainly become a daunting task, in addition to
>> making the model extremely big.
>>
>> My follow-up question is: what is your strategy after you create
>> within and between dummies for each such categorical variable? What are
>> the next steps?
>>
>> Thank you very much, Simon
>>
>> p.s. After your `robu()` call I get: `Warning message: In sqrt(eigenval)
>> : NaNs produced`
>>
>> On Wed, Jun 3, 2020 at 8:45 AM Gerta Ruecker <ruecker using imbi.uni-freiburg.de> wrote:
>>
>>> Simon
>>>
>>> Maybe there should not be a line break between "Relative and Rating"?
>>>
>>> For characters, for example if they are used as legends, line breaks
>>> sometimes matter.
>>>
>>> Best,
>>>
>>> Gerta
>>>
>>> On 03.06.2020 at 15:32, James Pustejovsky wrote:
>>> > I'm not sure what produced that error and I cannot reproduce it. It may
>>> > have something to do with the version of dplyr. Here's an alternative
>>> > way to recode the Scoring variable, which might be less prone to
>>> > versioning differences:
>>> >
>>> > library(dplyr)
>>> > library(fastDummies)
>>> > library(robumeta)
>>> >
>>> > data("oswald2013")
>>> >
>>> > oswald_centered <-
>>> >    oswald2013 %>%
>>> >
>>> >    # make dummy variables
>>> >    mutate(
>>> >      Scoring = factor(Scoring,
>>> >                       levels = c("Absolute", "Difference Score",
>>> >                                  "Relative Rating"),
>>> >                       labels = c("Absolute", "Difference", "Relative"))
>>> >    ) %>%
>>> >    dummy_columns(select_columns = "Scoring") %>%
>>> >
>>> >    # centering by study
>>> >    group_by(Study) %>%
>>> >    mutate_at(vars(starts_with("Scoring_")),
>>> >              list(wthn = ~ . - mean(.), btw = ~ mean(.))) %>%
>>> >
>>> >    # calculate Fisher Z and variance
>>> >    mutate(
>>> >      Z = atanh(R),
>>> >      V = 1 / (N - 3)
>>> >    )
>>> >
>>> >
>>> > # Use the predictors in a meta-regression model
>>> > # with Scoring = Absolute as the omitted category
>>> >
>>> > robu(Z ~ Scoring_Difference_wthn + Scoring_Relative_wthn +
>>> >         Scoring_Difference_btw + Scoring_Relative_btw,
>>> >       data = oswald_centered, studynum = Study, var.eff.size = V)
>>> >
>>> > On Tue, Jun 2, 2020 at 10:20 PM Simon Harmel <sim.harmel using gmail.com> wrote:
>>> >
>>> >> Many thanks, James! I keep getting the following error when I run your
>>> >> code:
>>> >>
>>> >> Error: unexpected symbol in:
>>> >> "Rating" = "Relative")
>>> >> oswald_centered"
>>> >>
>>> >> On Tue, Jun 2, 2020 at 10:00 PM James Pustejovsky <jepusto using gmail.com>
>>> >> wrote:
>>> >>
>>> >>> Hi Simon,
>>> >>>
>>> >>> The same strategy can be followed by using dummy variables for each
>>> >>> unique level of a categorical moderator. The idea would be to 1)
>>> >>> create dummy variables for each category, 2) calculate the
>>> >>> study-level means of the dummy variables (between-cluster
>>> >>> predictors), and 3) calculate the group-mean centered dummy
>>> >>> variables (within-cluster predictors). Just as with regular
>>> >>> categorical predictors, you'll have to pick one reference level to
>>> >>> omit when using these sets of predictors.
>>> >>>
>>> >>> Here is an example of how to carry out such calculations in R, using
>>> >>> the fastDummies package along with a bit of dplyr:
>>> >>>
>>> >>> library(dplyr)
>>> >>> library(fastDummies)
>>> >>> library(robumeta)
>>> >>>
>>> >>> data("oswald2013")
>>> >>>
>>> >>> oswald_centered <-
>>> >>>    oswald2013 %>%
>>> >>>
>>> >>>    # make dummy variables
>>> >>>    mutate(
>>> >>>      Scoring = recode(Scoring,
>>> >>>                       "Difference Score" = "Difference",
>>> >>>                       "Relative Rating" = "Relative")
>>> >>>    ) %>%
>>> >>>    dummy_columns(select_columns = "Scoring") %>%
>>> >>>
>>> >>>    # centering by study
>>> >>>    group_by(Study) %>%
>>> >>>    mutate_at(vars(starts_with("Scoring_")),
>>> >>>              list(wthn = ~ . - mean(.), btw = ~ mean(.))) %>%
>>> >>>
>>> >>>    # calculate Fisher Z and variance
>>> >>>    mutate(
>>> >>>      Z = atanh(R),
>>> >>>      V = 1 / (N - 3)
>>> >>>    )
>>> >>>
>>> >>>
>>> >>> # Use the predictors in a meta-regression model
>>> >>> # with Scoring = Absolute as the omitted category
>>> >>>
>>> >>> robu(Z ~ Scoring_Difference_wthn + Scoring_Relative_wthn +
>>> >>>         Scoring_Difference_btw + Scoring_Relative_btw,
>>> >>>       data = oswald_centered, studynum = Study, var.eff.size = V)
>>> >>>
>>> >>>
>>> >>> Kind Regards,
>>> >>> James
>>> >>>
>>> >>> On Tue, Jun 2, 2020 at 6:49 PM Simon Harmel <sim.harmel using gmail.com> wrote:
>>> >>>
>>> >>>> Hi All,
>>> >>>>
>>> >>>> Page 13 (top of the page) of *this article*
>>> >>>> <https://cran.r-project.org/web/packages/robumeta/vignettes/robumetaVignette.pdf>
>>> >>>> recommends that if a *continuous* moderator varies both within and
>>> >>>> across studies in a meta-analysis, a strategy is to break that
>>> >>>> moderator down into two moderators by:
>>> >>>>
>>> >>>> *(a)* taking the mean of the moderator within each study
>>> >>>> (between-cluster effect),
>>> >>>>
>>> >>>> *(b)* centering the predictor within each study (within-cluster
>>> >>>> effect).
>>> >>>>
>>> >>>> BUT what if my original moderator that varies both within and across
>>> >>>> studies is a *"categorical"* moderator?
>>> >>>>
>>> >>>> I would appreciate an R demonstration of the recommended strategy.
>>> >>>> Thanks,
>>> >>>> Simon
>>> >>>>
>>> >>>> _______________________________________________
>>> >>>> R-sig-meta-analysis mailing list
>>> >>>> R-sig-meta-analysis using r-project.org
>>> >>>> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>>> >>>>
>>>
>>> --
>>>
>>> Dr. rer. nat. Gerta Rücker, Dipl.-Math.
>>>
>>> Institute of Medical Biometry and Statistics,
>>> Faculty of Medicine and Medical Center - University of Freiburg
>>>
>>> Stefan-Meier-Str. 26, D-79104 Freiburg, Germany
>>>
>>> Phone:    +49/761/203-6673
>>> Fax:      +49/761/203-6680
>>> Mail:     ruecker using imbi.uni-freiburg.de
>>> Homepage: https://www.uniklinik-freiburg.de/imbi.html
>>>
>>>
