[R-meta] Multivariate data: RVE imputing covariance matrices

Bernard Fernou bernard.fernou at gmail.com
Fri Apr 9 11:35:36 CEST 2021


Dear all

We are conducting a meta-analysis of observational studies in which several
studies have multiple effect sizes.

*Description of our data*

So that you have a reproducible example, I have adapted the "SATcoaching"
data from the "clubSandwich" package to mimic our situation (the code is
provided below).

In our meta-analysis, we collected data from 60 studies (~100 effect
sizes) that used Pearson's correlation to measure the effect size of the
association between two numeric constructs. Our "independent variable"
could be measured with 4 different methods and our "outcome" could be
measured with 8 different methods.

Some studies reported multiple effect sizes. This comes from the fact that
(i) studies included multiple measures of the outcome or (ii) studies
included multiple measures of the independent variable (no study combined
multiple measures of the independent variable with multiple measures of
the outcome).

Unfortunately, we do not have the correlation matrix between the different
measures of the outcome/independent variable.

*Objective of our meta-analysis*

We are interested in generating:

1) a pooled effect size across measures

2) a pooled effect size for each combination of outcome and independent
variable measures


*Approach 1 (standard RVE)*

Initially, our approach was to use robust variance estimation via the
robumeta package to account for the dependency between our effect sizes.

# Main analysis:

robumeta::robu(z ~ 1,
               data = SATcoaching_order,
               studynum = study,
               var.eff.size = Vz,
               modelweights = "CORR",
               small = TRUE)

# For the moderation analysis assessing the pooled effect size for each
# combination of outcome/independent variable measures, we created a
# variable combining all observed combinations of outcome measures and
# independent variable measures (labelled "out_var"):

robumeta::robu(z ~ -1 + out_var,
               data = SATcoaching_order,
               studynum = study,
               var.eff.size = Vz,
               modelweights = "CORR",
               small = TRUE)
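As a follow-up on objective 2, we also considered an omnibus test of
whether the combination-specific pooled effects differ from one another.
Here is a minimal sketch, assuming that clubSandwich's Wald_test() and
constrain_equal() accept robu objects (which I believe they do):

# Hedged sketch: omnibus test that all out_var pooled effects are equal
robu_mod <- robumeta::robu(z ~ -1 + out_var,
                           data = SATcoaching_order,
                           studynum = study,
                           var.eff.size = Vz,
                           modelweights = "CORR",
                           small = TRUE)
k <- nlevels(factor(SATcoaching_order$out_var)) # number of combinations
clubSandwich::Wald_test(robu_mod,
                        constraints = clubSandwich::constrain_equal(1:k),
                        vcov = "CR2")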


*Approach 2 (CSE)*

I have recently read, with great interest, the preprint by Profs.
Pustejovsky and Tipton entitled "Meta-Analysis with Robust Variance
Estimation: Expanding the Range of Working Models".

As they describe, we assume within-study heterogeneity in true effect
sizes according to the outcome/independent variable measures (empirically,
we found that some combinations have very low inconsistency, around 20%,
while others have very high inconsistency, around 90%).
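For reference, here is roughly how we obtained those per-combination
inconsistency values; a quick sketch based on separate univariate fits per
combination, which ignores the within-study dependence (illustration only):

# Rough per-combination I^2 from separate univariate random-effects fits
sapply(split(SATcoaching_order, SATcoaching_order$out_var),
       function(d) if (nrow(d) > 1) metafor::rma(yi = z, vi = Vz, data = d)$I2 else NA)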

# This is how we implemented this approach:

V_list <- impute_covariance_matrix(vi = SATcoaching_order$Vz,
                                   cluster = SATcoaching_order$study,
                                   r = 0.7,
                                   return_list = FALSE,
                                   smooth_vi = TRUE,
                                   subgroup = SATcoaching_order$out_var)



# Primary analysis (all outcome/independent variable measures combined)

res <- metafor::rma.mv(yi = z,
                       V = V_list,
                       data = SATcoaching_order,
                       random = list(~ out_var | study),
                       struct = "DIAG",
                       sparse = TRUE)

coef_test(res, vcov = "CR2")
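For objective 1, we then extract a robust confidence interval for the
pooled effect and back-transform it; a small sketch, assuming z is
Fisher's z so that tanh() returns Pearson's r:

ci <- clubSandwich::conf_int(res, vcov = "CR2") # robust 95% CI on the z scale
tanh(c(ci$beta, ci$CI_L, ci$CI_U))              # back-transform to the correlation scale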



# Moderation analysis with all possible combinations

resmod <- metafor::rma.mv(yi = z,
                          V = V_list,
                          mods = ~ out_var - 1,
                          data = SATcoaching_order,
                          random = list(~ out_var | study),
                          struct = "DIAG",
                          sparse = TRUE)

coef_test(resmod, vcov = "CR2")


# In the moderation model, some moderator levels produced NaN robust
# standard errors (the model-based standard errors were fine).
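To investigate the NaN values, we counted how many distinct studies
contribute to each combination; our (unconfirmed) hunch is that the levels
informed by very few studies are the problematic ones:

# Number of distinct studies contributing to each out_var level
with(SATcoaching_order, tapply(study, out_var, function(s) length(unique(s))))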


*Approach 3 (CHE)*

Because the previous model did not produce robust standard errors for all
moderator levels, we tried a simpler model as a sensitivity analysis.

V_list2 <- impute_covariance_matrix(vi = SATcoaching_order$Vz,
                                    cluster = SATcoaching_order$study,
                                    r = 0.7,
                                    return_list = FALSE,
                                    smooth_vi = TRUE)
# note: the subgroup argument is omitted here


resmod2 <- metafor::rma.mv(yi = z,
                           V = V_list2,
                           mods = ~ out_var - 1,
                           data = SATcoaching_order,
                           random = ~ 1 | study / esid,
                           sparse = TRUE)

coef_test(resmod2, vcov = "CR2")

# Using this simpler approach, all coefficients are estimated with robust
# standard errors.
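Since r = 0.7 is only an assumption, we also plan a sensitivity analysis
over r; a minimal sketch that refits an intercept-only version of the CHE
model for several assumed values:

# Hedged sketch: pooled estimate under different assumed correlations r
sapply(c(0.3, 0.5, 0.7, 0.9), function(r) {
  V_r <- impute_covariance_matrix(vi = SATcoaching_order$Vz,
                                  cluster = SATcoaching_order$study,
                                  r = r,
                                  return_list = FALSE,
                                  smooth_vi = TRUE)
  fit <- metafor::rma.mv(yi = z,
                         V = V_r,
                         data = SATcoaching_order,
                         random = ~ 1 | study / esid,
                         sparse = TRUE)
  coef(fit) # overall pooled z
})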


*Question 1.*

Could you confirm that robust variance estimation is appropriate for our
data, given that dependence between effect sizes is produced not only by
the presence of several outcome measures, but also by the presence of
several independent variable measures?

*Question 2.*

Is there an approach that should be strongly preferred (we tend to believe
that the CSE approach would be the most suitable), and is the
implementation of the various models syntactically appropriate?

*Question 3.*

Is it reasonable to anticipate within-study heterogeneity in true effect
sizes according to the outcome/independent variable measures, given that
most of the studies (70%) used only one combination of outcome and
independent variable measure?


Best regards

BF


###### Reproducible example

library(clubSandwich)

data(SATcoaching) # load the example data

names(SATcoaching)[names(SATcoaching) == "d"] <- "z" # we use z as the effect size

SATcoaching$Vz <- 1 / ((SATcoaching$nT + SATcoaching$nC) - 3) # variance of our effect size

names(SATcoaching)[names(SATcoaching) == "test"] <- "outcome"



# Here, we set some studies to a single outcome (so that the measure of
# the independent variable varies instead)

SATcoaching$outcome[c(12, 14, 17, 19, 33, 67)] <- "Math"



# We create a new variable indicating that the independent variable was
# measured using 3 different tools (A, B, or C)

SATcoaching$measure_ind_variable <- "A"
SATcoaching$measure_ind_variable[c(5, 6)]   <- c("B", "C")
SATcoaching$measure_ind_variable[c(12, 13)] <- c("B", "C")
SATcoaching$measure_ind_variable[c(14, 15)] <- c("B", "C")
SATcoaching$measure_ind_variable[c(16, 17)] <- c("A", "C")
SATcoaching$measure_ind_variable[c(18, 19)] <- c("A", "C")
SATcoaching$measure_ind_variable[c(32, 33)] <- c("A", "C")
SATcoaching$measure_ind_variable[c(66, 67)] <- c("A", "C")


# We reorder the data frame and add an effect size identifier

SATcoaching_order <- SATcoaching[order(SATcoaching$study), ]
SATcoaching_order$esid <- 1:nrow(SATcoaching_order)


# Then, we create a variable combining, for each study, the outcome and
# the measure of the independent variable:

SATcoaching_order$out_var <- with(SATcoaching_order,
                                  paste0(outcome, "_", measure_ind_variable))
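A quick check of the resulting dependence structure (how many effect sizes
each study contributes, and which outcome/independent variable
combinations occur):

table(table(SATcoaching_order$study)) # distribution of effect sizes per study
table(SATcoaching_order$out_var)      # observed combinations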



