[R-meta] Specifying the random effect structure of our multilevel meta-analysis in metafor
James Pustejovsky
jepu@to @end|ng |rom gm@||@com
Mon Jan 23 16:07:14 CET 2023
Hi Francisco,
Your question is ultimately an empirical one because the best model is one
that captures the actual sources of variation in effects, so I don't think
you'll find a definitive answer. I'll offer a couple of observations.
First, since your research aims seem to focus on variation across VPGs, I
think you should prioritize modeling that variation with random effects,
rather than treating VPG solely as a fixed factor (as in your first model,
which has random effects for arms nested within studies but not for VPGs).
Second, it sounds like your data structure consists of multiple studies,
each with two factors: voting propensity group (VPG) and experimental arm.
Thus, in principle you could have a model in which effects for each unique
combination of VPG and arm vary and co-vary across studies, as in:
df$arm.by.vpg <- paste0(df$arm, df$voting_prop)
Model A: random = ~ arm.by.vpg | experiment, struct = "UN"
This is assuming that the arm labels (1, 2, 3 in your data) have the same
meaning across studies. Further, this model might be quite difficult to fit
if there are a lot of unique arms (e.g., 3 arms, 3 VPG categories leads to
a 9-dimensional multivariate effect for each study).
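
To make the size of Model A concrete, here is a minimal base-R sketch that rebuilds only the design columns of the example data (experiment, arm, voting_prop) and counts the levels of arm.by.vpg. The object names `design`, `k`, and `n_varcov` are illustrative, and the "." separator in `paste()` is an added assumption to keep labels unambiguous.

```r
# Design columns of the example data from the original message (no effects).
design <- data.frame(
  experiment  = rep(c("A", "B", "C"), times = c(3, 6, 9)),
  arm         = c(1, 1, 1,  1, 2, 1, 2, 1, 2,  1, 2, 3, 1, 2, 3, 1, 2, 3),
  voting_prop = c("low", "medium", "high",
                  "low", "low", "medium", "medium", "high", "high",
                  rep(c("low", "medium", "high"), each = 3))
)

# One label per unique arm-by-VPG combination (the inner factor in Model A).
design$arm.by.vpg <- paste(design$arm, design$voting_prop, sep = ".")

# Dimension of the per-study random effect under struct = "UN".
k <- length(unique(design$arm.by.vpg))

# An unstructured k x k matrix has k variances and k * (k - 1) / 2 covariances.
n_varcov <- k * (k + 1) / 2

cat("levels:", k, " variance-covariance parameters:", n_varcov, "\n")
```

The parameter count grows quadratically in the number of arm-by-VPG levels, which is why the unstructured model can be hard to fit with few studies.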
So, perhaps you could simplify the model by imposing some further
structure. For instance, you could treat the arms and VPGs as following an
additive compound-symmetric structure:
df$vpg.in.experiment <- paste0(df$experiment, df$voting_prop)
Model B: random = list(~ 1 | experiment, ~ 1 | vpg.in.experiment,
~ 1 | arm.in.experiment)
or, equivalently:
Model B: random = list(~ 1 | experiment / voting_prop,
~ 1 | arm.in.experiment)
Here, you've got study-level random effects, within-study effects for each
VPG, and within-study effects for each arm. In principle you could also
include the interaction between VPG and arm in study:
Model C: random = list(~ 1 | experiment / voting_prop / arm,
~ 1 | arm.in.experiment)
The assumptions I would be concerned with in this model are that it treats
the within-study variation as homogeneous for each VPG and homogeneous
across studies. This might not be ideal if, for instance, the
study-to-study variation in effects is minimal for the low VPG but
substantial for the high VPG. Loosening this assumption can be done with
something like
Model D: random = list(~ voting_prop | experiment, ~ 1 | arm.in.experiment),
struct = "UN"
which is a compromise between the full multivariate model (Model A) and the
compound symmetry model (Model B).
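
As a rough illustration of Model D, the sketch below fits it with rma.mv() on simulated data, using the voting_prop column name from the example data. Everything about the data is made up for illustration: the balanced design of 40 experiments, the helper `draw()`, the heterogeneity values in `sd_by_vpg`, and the constant sampling variance `vi`, which stands in for the full V matrix you would use in practice.

```r
library(metafor)

set.seed(20230123)

# Hypothetical balanced design: 40 experiments x 3 VPGs x 2 arms.
dat <- expand.grid(
  arm         = c("1", "2"),
  voting_prop = c("low", "medium", "high"),
  experiment  = sprintf("E%02d", 1:40),
  stringsAsFactors = FALSE
)
dat$arm.in.experiment <- paste(dat$experiment, dat$arm, sep = ".")
vpg.in.experiment     <- paste(dat$experiment, dat$voting_prop, sep = ".")

# Helper: one normal draw per level of `id`, expanded to one value per row.
draw <- function(id, sd) {
  lev <- unique(id)
  unname(setNames(rnorm(length(lev), 0, sd), lev)[id])
}

# Made-up truth: study-to-study variation is small for low VPG and
# large for high VPG, plus a shared experiment-level effect.
sd_by_vpg <- c(low = 0.02, medium = 0.08, high = 0.15)
u_exp <- draw(dat$experiment, 0.05)
u_vpg <- draw(vpg.in.experiment, 1) * unname(sd_by_vpg[dat$voting_prop])
u_arm <- draw(dat$arm.in.experiment, 0.05)

# Assumed constant sampling variance; a real analysis would pass the full V.
dat$vi     <- 0.005
dat$effect <- 0.2 + u_exp + u_vpg + u_arm + rnorm(nrow(dat), 0, sqrt(dat$vi))

# Model D: unstructured VPG effects within experiment, plus arm effects.
fit_D <- rma.mv(effect, V = vi, mods = ~ voting_prop,
                random = list(~ voting_prop | experiment,
                              ~ 1 | arm.in.experiment),
                struct = "UN", data = dat)

print(fit_D)
```

The printed output reports one tau^2 per VPG level and the correlations between them, so you can see whether the heterogeneity really differs across VPGs.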
Third, in the example data you sent, each study includes low-, medium-, and
high-VPG groups. If that is true in your real data, then all's well. If
not---as in, if some studies include only high-VPG and other studies
include only low and medium VPG---then you might also want to consider
centering the voting_prop predictor within study. Specifically, you'd
create dummy variables for each category and then center each dummy
variable by study. Centering in this way means that the differences between
VPG levels will be estimated using the direct, head-to-head comparison of
effects within each study. Without centering, the differences between VPG
levels will be estimated based in part on cross-study differences, which
are at greater risk of confounding due to factors (methodological or
otherwise) that vary from study to study. See Tanner-Smith & Tipton (2014;
https://doi.org/10.1002/jrsm.1091) for further discussion.
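
A minimal base-R sketch of this within-study centering follows. The toy data frame is hypothetical (study S1 has only high-VPG effects, S2 only low and medium, S3 all three), and the column names ending in `_c` are illustrative.

```r
# Hypothetical unbalanced design: not every study includes every VPG level.
dat <- data.frame(
  experiment  = c("S1", "S1", "S2", "S2", "S3", "S3", "S3"),
  voting_prop = factor(c("high", "high", "low", "medium",
                         "low", "medium", "high"),
                       levels = c("low", "medium", "high"))
)

# One 0/1 dummy column per VPG category.
dummies <- model.matrix(~ 0 + voting_prop, data = dat)
colnames(dummies) <- levels(dat$voting_prop)

# Center each dummy by subtracting its within-study mean.
centered <- apply(dummies, 2, function(x) x - ave(x, dat$experiment))
colnames(centered) <- paste0(colnames(dummies), "_c")

dat <- cbind(dat, centered)
print(dat)
```

The centered columns (omitting one as the reference category) could then serve as moderators, for example mods = ~ medium_c + high_c. In this toy example the centered dummies are all zero for S1, so a study with a single VPG level contributes nothing to the estimated differences between levels.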
Fourth, and predictably for me, this is another good instance to consider
using robust variance estimation (RVE) methods for inference. You've got a
complicated data structure and potential concern that the random effects
specification could be mis-specified in some way (such as by failing to
capture a relevant source of study-to-study variation). RVE is essentially
an insurance policy against model misspecification. Even if you don't nail
down the random effects structure perfectly, you'll still have protection
for inference about the fixed effects of voting_prop levels. See
Pustejovsky & Tipton (2022; https://doi.org/10.1007/s11121-021-01246-3) for
details.
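
One possible way to apply RVE in metafor is to fit a working model and pass it to robust() with clustering at the experiment level. The sketch below does this for Model B on simulated data. The data are made up (30 experiments, the helper `draw()`, a constant sampling variance `vi` in place of the full V matrix), and clubSandwich = TRUE assumes the clubSandwich package is installed.

```r
library(metafor)  # robust(..., clubSandwich = TRUE) also needs clubSandwich

set.seed(42)

# Hypothetical balanced design: 30 experiments x 3 VPGs x 2 arms.
dat <- expand.grid(
  arm         = c("1", "2"),
  voting_prop = c("low", "medium", "high"),
  experiment  = sprintf("E%02d", 1:30),
  stringsAsFactors = FALSE
)
dat$vpg.in.experiment <- paste(dat$experiment, dat$voting_prop, sep = ".")
dat$arm.in.experiment <- paste(dat$experiment, dat$arm, sep = ".")

# Helper: one normal draw per level of `id`, expanded to one value per row.
draw <- function(id, sd) {
  lev <- unique(id)
  unname(setNames(rnorm(length(lev), 0, sd), lev)[id])
}

# Assumed constant sampling variance; a real analysis would pass the full V.
dat$vi     <- 0.005
dat$effect <- 0.2 +
  draw(dat$experiment, 0.10) +
  draw(dat$vpg.in.experiment, 0.05) +
  draw(dat$arm.in.experiment, 0.05) +
  rnorm(nrow(dat), 0, sqrt(dat$vi))

# Working model (Model B): additive random effects.
fit_B <- rma.mv(effect, V = vi, mods = ~ voting_prop,
                random = list(~ 1 | experiment,
                              ~ 1 | vpg.in.experiment,
                              ~ 1 | arm.in.experiment),
                data = dat)

# Cluster-robust inference for the fixed effects, clustering by experiment.
fit_rve <- robust(fit_B, cluster = dat$experiment, clubSandwich = TRUE)

print(fit_rve)
```

The point estimates are those of the working model; only the standard errors, tests, and confidence intervals for the voting_prop coefficients change.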
James
On Mon, Jan 23, 2023 at 5:05 AM Tomas-Valiente Jorda Francisco <
tomasf using student.ethz.ch> wrote:
> Hello,
>
> I am part of a team, with Prof Peter John at King’s College London and Dr
> Florian Foos and Ceren Cinar at the London School of Economics, that is
> working on a meta-analysis of get-out-the-vote (GOTV) interventions. Our
> project looks at whether GOTV’s effect is larger on people with low vs high
> voting propensity. We had a question about how to specify the
> random-effects structure of our model.
>
> Our setup is as follows. We have estimates for the ITT of many GOTV
> interventions, computed comparing turnout between individuals randomized to
> some treatment group and a control group. Importantly, for each treatment,
> we have effect estimates separately for individuals with high vs medium vs
> low voting propensities. Many experiments have multiple treatment arms
> (which often differ on the particular get-out-the-vote message used), so
> effectively we have effect estimates for each voting
> propensity-experiment-arm triplet, with arms nested within experiments.
> Since we have subjects of each voting propensity on (almost) all arms, for
> any experiment with (say) two treatment arms, we have an estimate of
> control vs treatment 1 and control vs treatment 2 for each voting
> propensity separately. Using simulated data, our dataset has the following
> structure:
>
> df <- structure(list(effect = c(0.155803932130353, 0.172093700115289,
> 0.228564716936809, -0.0029411764705886, 0.226006191950464,
> 0.24365585512081, 0.205042523718298, 0.529411764705883, 0.418300653594771,
> 0.333333333333333, 0.357142857142857, 0.333333333333333, 0.227386316387913,
> 0.110813804839315, -0.0158629160414587, 0.362962962962963,
> 0.266666666666667, 0.127272727272728), arm = structure(c(1L, 1L, 1L, 1L,
> 2L, 1L, 2L, 1L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("1",
> "2", "3"), class = "factor"), experiment = c("A", "A", "A", "B", "B", "B",
> "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", "C", "C"), voting_prop =
> c("low", "medium", "high", "low", "low", "medium", "medium", "high",
> "high", "low", "low", "low", "medium", "medium", "medium", "high", "high",
> "high")), row.names = c(NA, -18L), class = "data.frame")
>
> Sampling errors are not independent across arms within voting
> propensity-experiment pairs, since the same control group (which is
> specific to the voting propensity-experiment pair) is used to estimate the
> ITT of different treatment arms. We use the Gleser & Olkin (2009) method to
> estimate the variance-covariance matrix of sampling errors, which for the
> example above is:
>
> V <- structure(c(0.00527759653574149, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0.000272246907504789, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0012451834193348, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0268238335114407,
> 0.0144993694656436, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0.0144993694656436, 0.0274724895138511, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.000219964986797141,
> 0.000110094264225202, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0.000110094264225202, 0.000220356782547541, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0227385071675729,
> 0.0084216693213233, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0.0084216693213233, 0.0163754681247953, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0523485584710074,
> 0.0294460641399417,
> 0.0294460641399417, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0.0294460641399417, 0.0441690962099125, 0.0294460641399417,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0294460641399417,
> 0.0294460641399417, 0.0523485584710074, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.000477394970026329, 0.00023697669984828,
> 0.000236976699848284, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0.00023697669984828, 0.000475661779613111, 0.000236976699848285,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.000236976699848284,
> 0.000236976699848285, 0.000476154421676615, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0147396599742279,
> 0.00947549569771795,
> 0.00947549569771793, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0.00947549569771795, 0.0189509913954359, 0.00947549569771794,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.00947549569771793,
> 0.00947549569771794, 0.015936060946162), .Dim = c(18L, 18L))
>
> We are interested in testing whether the effect of get-out-the-vote
> interventions differs across voting propensities. We are planning to use
> the metafor package. But we are not sure of what would be a reasonable way
> to specify the random effect structure of the model to properly capture how
> true effects are correlated. We think the best approach is to estimate the
> following multilevel meta-analysis model that captures that arms are nested
> within experiments, such that true effects of different arms in the same
> experiment may resemble each other (e.g. because they were tested in the
> same election and country):
>
> rma.mv(effect, random = list(~ 1 | experiment/arm), data = df, V = V,
> mods = ~ voting_prop)
>
> Does the model above look reasonable? Or do we also need to allow true
> effects to be correlated within voting propensity groups across arms? We
> think this is not necessary but are not sure. If it is, we are not really
> sure about which of the approaches below is the best way to specify this.
> Note voting propensities are not really nested within experiment-arms
> (effectively this is a cross-classified structure).
>
> rma.mv(effect, random = list(~ 1 | experiment/arm/voting_prop), data =
> df, V = V, mods = ~ voting_prop)
>
> df$arm.in.experiment <- paste0(df$experiment, df$arm)
> rma.mv(effect, random = list(~ 1 | experiment, ~ arm.in.experiment |
> voting_prop), data = df, V = V, mods = ~ voting_prop)
>
> Any guidance you could provide would be very much welcome.
>
> Francisco Tomás-Valiente Jordá
> ETH Zürich