[R-meta] Questions about multilevel meta-analysis structure
Isaac Calvin Saywell
isaac.saywell using adelaide.edu.au
Tue Oct 24 05:09:42 CEST 2023
Hi Reza & James,
Thank you for your help with my previous questions. I am working on another multilevel meta-analysis with a colleague and we were hoping you could help us with a few more questions. This is an individual participant data (IPD) meta-analysis, so we have access to the raw data. We are exploring the relationship between cognitive reserve (CR) indicators (e.g., educational attainment, occupational complexity, premorbid IQ) and cognitive outcomes (e.g., working memory, long-term memory, visual processing speed, etc.) among individuals previously infected with COVID-19. We are specifically exploring how CR and COVID-19 severity (e.g., time in hospital, need for oxygen therapy, etc.) interact to predict cognitive outcomes. We would expect to find a negative relationship between disease severity and cognitive outcomes that is less pronounced in people with higher CR (see hypothetical interaction: https://universityofadelaide.box.com/s/c0ehcj70m3765w99xbi6om61udos5czv).
To assess this relationship we have fitted regression models of the following form:
lm(cognitive_outcome ~ age + sex + cognitive_reserve * COVID_severity)
In a given study, there may be a range of cognitive outcomes, cognitive reserve indicators and COVID severity indicators that were assessed. To capture all this information we repeated the linear model for every possible combination of these (ranging anywhere from a handful to 200 models per study). The main effects and the interaction effect were then extracted in the form of semipartial correlations<https://journals.sagepub.com/doi/10.3102/1076998610396901>, which are the effect sizes of interest for the meta-analyses. (A semipartial correlation is essentially a standard correlation in which the predictor has been adjusted for potential confounds such as age and sex.) Across 30 studies we have derived approximately 1000 effect sizes for each of the main effects and the interaction term.
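Concretely, the extraction for one term looks something like the following base-R sketch (simulated data with illustrative variable names; cr, severity and wm are stand-ins, not the actual IPD). It also cross-checks the residualization route against the standard conversion from the model's t-statistic:

```r
set.seed(42)

## Simulated stand-ins for the IPD (illustrative only)
n        <- 200
age      <- rnorm(n, 60, 8)
sex      <- rbinom(n, 1, 0.5)
cr       <- rnorm(n)        # cognitive reserve indicator
severity <- rnorm(n)        # COVID severity indicator
int      <- cr * severity   # interaction term
wm       <- 0.3 * cr - 0.2 * severity - 0.1 * int + 0.01 * age + rnorm(n)

fit <- lm(wm ~ age + sex + cr + severity + int)

## Semipartial (part) correlation of the interaction term: correlate the
## outcome with the part of the predictor unexplained by the other terms.
sr <- cor(wm, resid(lm(int ~ age + sex + cr + severity)))

## Equivalent conversion from the model's t-statistic:
## sr = t * sqrt((1 - R^2_full) / df_residual)
tval <- coef(summary(fit))["int", "t value"]
sr_t <- tval * sqrt((1 - summary(fit)$r.squared) / df.residual(fit))

all.equal(sr, sr_t)  # the two routes agree
```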
We have then run three multilevel meta-analyses (for the main effect of CR, main effect of COVID severity, and the interaction effect) with the following code:
V <- impute_covariance_matrix(vi = df$vi, cluster = df$study_id, r = 0.6)

model <- rma.mv(yi,
                V,
                random = list(~ 1 | study_id/effectsize_id),
                tdist = TRUE,
                method = "REML",
                sparse = TRUE,
                data = data_temp)
and performed a series of moderation analyses which assess the influence of (1) cognitive domain, (2) CR indicator and (3) severity indicator on the results.
moderation <- rma.mv(yi,
                     V,
                     mods = ~ moderator,
                     random = list(~ moderator | study_id, ~ 1 | effectsize_id),
                     struct = "HCS",
                     tdist = TRUE,
                     method = "REML",
                     sparse = TRUE,
                     control = list(rel.tol = 1e-8),
                     data = data_temp_2)
As we have been conducting these analyses, several questions have presented themselves.
(1) Plotting the moderating influence of CR indicator (CRQ scores, Education level, Education years etc) on the main effect of CR resulted in the following: https://universityofadelaide.box.com/s/7r3km1oxdq8l0k8v42o3y7jtblq2ctel
As you can see, the estimated central tendency for CRQ scores is greatly inflated relative to the individual effect sizes (it is also somewhat inflated for the other CR indicators). We assume the estimate for CRQ scores is inflated because its effect sizes were derived from only one study; is this correct? If so, should it be dropped from the moderation analysis? If not, do you have any ideas as to why this occurred? (Several of our other moderation analyses show the same problem, though not to the same extent as this one.)
(2) Regarding the calculations used to compute V via the impute_covariance_matrix() function: would it be preferable to calculate V manually from the raw data? If so, are you able to point us to resources on how to do this?
(3) Finally, given the unique nature of our meta-analysis and the large number of effect sizes that are derived from often very similar regression models, are there any problems you can see regarding our use of these functions?
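To make question (2) concrete, our understanding is that impute_covariance_matrix() essentially fills in, within each study, Cov(e_i, e_j) = r * sqrt(vi_i * vi_j), and zero between studies. Here is a base-R sketch of that computation with toy values; with the IPD, the assumed constant r = 0.6 could presumably be replaced per study by the observed correlation among the outcome measures. Is this the computation we would be replicating?

```r
## Hand-built analogue of clubSandwich::impute_covariance_matrix():
## within each study, Cov(e_i, e_j) = r * sqrt(vi_i * vi_j); zero between
## studies. Toy values below; make_V is an illustrative helper, not a
## package function.
make_V <- function(vi, cluster, r) {
  V <- matrix(0, nrow = length(vi), ncol = length(vi))
  for (cl in unique(cluster)) {
    idx         <- which(cluster == cl)
    block       <- r * tcrossprod(sqrt(vi[idx]))  # r * sqrt(vi_i * vi_j)
    diag(block) <- vi[idx]                        # variances on the diagonal
    V[idx, idx] <- block
  }
  V
}

## Toy example: two studies contributing two effect sizes each
vi    <- c(0.04, 0.05, 0.03, 0.06)
study <- c("s1", "s1", "s2", "s2")
V     <- make_V(vi, study, r = 0.6)
```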
Any help with these questions would be greatly appreciated!
Kind regards,
Isaac Saywell
________________________________
From: Reza Norouzian <rnorouzian using gmail.com>
Sent: Monday, 24 July 2023 12:36 PM
To: Isaac Calvin Saywell <isaac.saywell using adelaide.edu.au>
Cc: R Special Interest Group for Meta-Analysis <r-sig-meta-analysis using r-project.org>; jepusto using gmail.com <jepusto using gmail.com>
Subject: Re: [R-meta] Questions about multilevel meta-analysis structure
CAUTION: External email. Only click on links or open attachments from trusted senders.
Isaac,
You don't need the "subgroup = dat$cog_domain" part in your
impute_covariance_matrix() call.
V_sub would be necessary if you were doing a subgroup analysis using a
single model, such as Model 2.1 or Model 2.2, where your cog_domain
could vary within the studies.
Once you decide (e.g., based on better model fit relative to other
candidate models) to adopt a multivariate model (letting the true
effects for the cognitive domains have a joint distribution across the
studies), then V_sub, which serves to make the sampling errors
associated with the cognitive domains independent within each study,
becomes irrelevant.
Please also take a look at the archives, as I believe you can find
multiple useful posts discussing several other relevant issues, such as
using cluster-robust inferences (p-values, CIs) for the average effects
of your cognitive domains in your current output.
Alternatively, you can check out metafor's help page related to this
issue by doing: ?robust.rma.mv
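In sketch form, that step looks roughly like the following (using a dataset bundled with metafor in place of your data; substitute your own fitted rma.mv object and your study_id as the clustering variable):

```r
library(metafor)

## Sketch of cluster-robust inference (see ?robust.rma.mv), fitted on
## the dat.konstantopoulos2011 example data shipped with metafor.
dat <- dat.konstantopoulos2011
res <- rma.mv(yi, vi, random = ~ 1 | district/school, data = dat)

## Robust (sandwich) SEs, p-values, and CIs, clustered at the top level;
## adding clubSandwich = TRUE applies a small-sample adjustment if the
## clubSandwich package is installed.
robust(res, cluster = dat$district)
```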
Kind regards,
Reza
On Sun, Jul 23, 2023 at 8:59 PM Isaac Calvin Saywell
<isaac.saywell using adelaide.edu.au> wrote:
>
> Hi James and Reza,
>
> Thank you both for your detailed responses, they have provided more clarity on multilevel modelling and cleared up any possible misunderstandings I had.
>
> My team and I have decided, in line with both of your suggestions, that "HCS" is the most appropriate model variance structure for our data (given there are many studies that don't include effects for all cognitive domains).
>
> Only a couple of cognitive domains get pulled downward in the multivariate model, while most effect estimates remain quite accurate. When testing the models you suggested for the equivalent of subgroup analyses, the effect estimates for the cognitive domains that had been pulled downward were much closer to the effects in the univariate models.
>
> My last question, then, is whether we should specify cognitive domain as the subgroup when imputing a variance-covariance matrix for our multilevel moderator model, or whether this is not appropriate. Would the following code be suitable?
>
> V <- impute_covariance_matrix(vi = dat$variance, cluster = dat$study_id, r = 0.6, subgroup = dat$cog_domain)
>
> res <- rma.mv(yi,
>               V,
>               mods = ~ cog_domain,
>               random = list(~ cog_domain | study_id, ~ 1 | unique_id),
>               struct = "HCS",
>               tdist = TRUE,
>               method = "REML",
>               data = dat)
>
>
> Thank you to both of you again for sharing your expertise, it has been highly appreciated.
>
> Kind regards,
>
> Isaac
> ________________________________
> From: R-sig-meta-analysis <r-sig-meta-analysis-bounces using r-project.org> on behalf of Reza Norouzian via R-sig-meta-analysis <r-sig-meta-analysis using r-project.org>
> Sent: Friday, 21 July 2023 12:36 AM
> To: R Special Interest Group for Meta-Analysis <r-sig-meta-analysis using r-project.org>
> Cc: Reza Norouzian <rnorouzian using gmail.com>
> Subject: Re: [R-meta] Questions about multilevel meta-analysis structure
>
>
> James' responses are right on. I had typed the following up a while ago,
> so instead of discarding it I'm putting it here in case it might be helpful.
>
> In general, how to model the effect sizes often depends on at least a
> couple of things. First, what are your study goals/objectives? For example,
> is it one of your goals to understand the extent of the relationships that
> exist among the true effects associated with your 9 different cognitive
> domains? Does such an understanding help you back up an existing
> theoretical/practical view, or bring a new one to the fore?
>
> If yes, then potentially one of “~inner | outer” type formulas in your
> model could to some extent help.
>
> Second, do you have empirical support to achieve your study goal? This one
> essentially explains why I hedged a bit (‘potentially’, ‘one of’, ‘to some
> extent’) toward the end when describing the first goal above. Typically,
> the structure of the data that you have collected could determine which (if
> any) of the available random-effects structures can lend empirical support
> to your initial goal.
>
> Some of these structures like UN allow you to tap into all the existing
> bivariate relationships between your 9 different cognitive domains. But
> that comes with a requirement. Those 9 cognitive domains must have
> co-occurred in a good number of the studies you have included in your
> meta-analysis. To the extent that this is not the case, you may need to
> simplify your random-effects structure using the alternative available
> structures (CS, HCS, etc.).
>
> Responses to your questions are in-line below.
>
> 1. Is my model correctly structured to account for dependency using the
> inner | outer formula (see MODEL 1 CODE below), or should I just specify
> random effects at the study and unique effect size level (see MODEL 2 CODE
> below)?
>
> Please see my introductory explanation above. But please also note that
> “struct=” only works with formulas that are of the form “~inner | outer”
> where inner is something other than intercept (other than ~1). Thus, UN
> is entirely ignored in model 2.
>
> 2. If I do need to specify an inner | outer formula to compare effect sizes
> across cognitive domains, then is an unstructured variance-covariance
> matrix ("UN") most appropriate (allowing tau^2 to differ among cognitive
> domains) or should another structure be specified?
>
> Please see my introductory explanation above.
>
> 3. To account for effect size dependency, is a variance-covariance matrix
> necessary (this is what my model currently uses), or is it ok to use the
> sampling variance of each effect size in the multilevel model?
>
> I’m assuming you’re referring to V. You’re not currently showing the
> structure of V. See also James' response.
>
> 4. When subsetting my data by one cognitive domain and investigating this
> same cognitive domain in a univariate multilevel model, the effect estimate
> tends to be lower compared to when all cognitive domains are included in a
> single multilevel model as a moderator. Is there a reason for this?
>
> See James’ answer.
>
>
> On Thu, Jul 20, 2023 at 9:53 AM James Pustejovsky via R-sig-meta-analysis <
> r-sig-meta-analysis using r-project.org> wrote:
>
> > Hi Isaac,
> >
> > Comments inline below. (You've hit on something I'm interested in, so
> > apologies in advance!)
> >
> > James
> >
> > On Thu, Jul 20, 2023 at 12:17 AM Isaac Calvin Saywell via
> > R-sig-meta-analysis <r-sig-meta-analysis using r-project.org> wrote:
> >
> > >
> > > 1. Is my model correctly structured to account for dependency using the
> > > inner | outer formula (see MODEL 1 CODE below), or should I just specify
> > > random effects at the study and unique effect size level (see MODEL 2 CODE
> > > below)?
> > >
> > >
> > The syntax looks correct to me except for two things. First, the first
> > argument of each model should presumably be yi = yi rather than vi. Second,
> > in Model 2, the struct argument is not necessary and will be ignored (it's
> > only relevant for models where the random effects have inner | outer
> > structure).
> >
> > Conceptually, this is an interesting question. Model 1 is theoretically
> > appealing because it uses a more flexible, general structure than Model 2.
> > Model 1 is saying that there are different average effects for each
> > cognitive domain, and each study has a unique set of effects per cognitive
> > domain that are distinct from each other but can be inter-correlated. In
> > contrast, Model 2 is saying that the study-level random effects apply
> > equally to all cognitive domains---if study X has higher-than-average
> > effects in domain A, then it will have effects in domain B that are equally
> > higher-than-average.
> >
> > The big caveat with Model 1 is that it can be hard to fit unless you have
> > lots of studies, and specifically lots of studies that report effects for
> > multiple cognitive domains. To figure out if it is feasible to estimate
> > this model, it can be useful to do some descriptives where you count the
> > number of studies that include effect sizes from each possible *pair* of
> > cognitive domains. If some pairs have very few studies, then it's going to
> > be difficult or impossible to fit the multivariate random effects structure
> > without imposing further restrictions.
> >
> > In case it's looking infeasible, there are some other random effects
> > structures that are intermediate between Model 1 and Model 2, which might
> > be worth trying:
> > Model 1.0: random = list(~ cog_domain | study_id, ~ 1 | effectsize_id),
> > struct = "UN"
> > Model 1.1: random = list(~ cog_domain | study_id, ~ 1 | effectsize_id),
> > struct = "HCS"
> > Model 1.2: random = list(~ cog_domain | study_id, ~ 1 | effectsize_id),
> > struct = "CS"
> > Model 1.2 (equivalent specification, I think): random = ~ 1 | study_id /
> > cog_domain / effectsize_id
> > Model 2.0: random = list(~ 1 | study_id, ~ 1 | effectsize_id)
> > Model 2.0 (equivalent specification): random = ~ 1 | study_id /
> > effectsize_id
> >
> > So perhaps there is something in between 1.0 and 2.0 that will strike a
> > balance between theoretical appeal and feasibility.
> >
> >
> > > 2. If I do need to specify an inner | outer formula to compare effect
> > > sizes across cognitive domains, then is an unstructured variance-covariance
> > > matrix ("UN") most appropriate (allowing tau^2 to differ among cognitive
> > > domains) or should another structure be specified?
> > >
> > > See previous response.
> >
> >
> > > 3. To account for effect size dependency, is a variance-covariance matrix
> > > necessary (this is what my model currently uses), or is it ok to use the
> > > sampling variance of each effect size in the multilevel model?
> > >
> >
> > This has been discussed previously on the listserv. My perspective is that
> > you should use whatever assumptions are most plausible. If you expect that
> > there really is correlation in the sampling errors (e.g., because the
> > effect size estimates are based on correlated outcomes measured on the same
> > set of respondents), then I think it is more defensible to use a
> > non-diagonal V matrix, as in your current syntax.
> >
> >
> > >
> > > 4. When subsetting my data by one cognitive domain and investigating this
> > > same cognitive domain in a univariate multilevel model, the effect estimate
> > > tends to be lower compared to when all cognitive domains are included in a
> > > single multilevel model as a moderator. Is there a reason for this?
> > >
> >
> > Is this true for *all* of the cognitive domains or only one or a few of
> > them? Your Model 1 and Model 2 use random effects models that assume effect
> > sizes from different cognitive domains are somewhat related (i.e., the
> > random effects are correlated within study) and so the average effect for a
> > given domain will be estimated based in part on the effect size estimates
> > for that domain and in part by "borrowing information" from other domains
> > that are correlated with it. Broadly speaking, the consequence of this
> > borrowing of information is that the average effects will tend to be pulled
> > toward each other, and thus will be a little less dispersed than if you
> > estimate effects through subgroup analysis.
> >
> > The above would explain why some domains would get pulled downward in the
> > multivariate model compared to the univariate model, but it would not
> > explain why *all* of the domains are pulled down. If it's really all of
> > them, then I suspect your data must have some sort of association between
> > average effect size and the number of effect size estimates per study.
> > That'd be weird and I'm not really sure how to interpret it. You could
> > check on this by calculating a variable (call it k_j) that is the number of
> > effect size estimates reported per study (across any cognitive domain) and
> > then including that variable as a predictor in Model 1 or Model 2 above.
> > This would at least tell you if there's something funky going on...
> >
> > As a bit of an aside, you can do the equivalent of a subgroup analysis
> > within the framework of a multivariate working model, which might be
> > another thing to explore to figure out what's going on. To do this, you'll
> > first need to recalculate your V matrix, setting the subgroup argument to
> > be equal to cog_domain. This amounts to making the assumption that there is
> > correlation between effect size estimates *within* the same domain but not
> > between domains of a given study. Call this new V matrix V_sub. Then try
> > the following model specifications:
> >
> > Model 2.1: V = V_sub, random = list(~ cog_domain | study_id, ~ cog_domain |
> > effectsize_id), struct = c("DIAG","DIAG")
> > Model 2.2: V = V_sub, random = list(~ cog_domain | study_id, ~ 1 |
> > effectsize_id), struct = "DIAG"
> >
> > Model 2.1 should reproduce what you get from running separate models by
> > subgroup.
> > Model 2.2 is a slight tweak on that, which assumes that there is a common
> > within-study, within-subgroup variance instead of allowing this to differ
> > by subgroup. Model 2.2 is nested in Models 1.0 and 1.1, but not in 1.2.
> >
> >
> > _______________________________________________
> > R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
> > To manage your subscription to this mailing list, go to:
> > https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
> >
>
>