[R-meta] Interpretation of the Q-test statistic in a multilevel meta-analysis

Wed Sep 11 12:51:42 CEST 2024

Dear Martin,

First of all: Over 8,000 effect sizes?!? Wow, you might be breaking some kind of record there.

A side note: Given the model below, I would suspect that setting 'sparse=TRUE' would help to speed up model fitting.

Now for your actual question: No, the Q-test does not test for "between-clusters variation" (at least not in the sense that it tests for variation between the units of the highest level in the multilevel structure, which seems to be what the reviewer is implying). The docs, which you read (thanks!), correctly spell out what the Q-test is testing. In essence, it is testing the given model against one without any random effects. In your case, this would be:

M1 <- rma.mv(yi = Corrz, V = vcov_mat, data = tmp_es_dat, random = ~ 1 | COUNTRY / SampleID / ESID)
M0 <- rma.mv(yi = Corrz, V = vcov_mat, data = tmp_es_dat)
anova(M0, M1)

except that this will give you a likelihood ratio test of the random effects, while the Q-test is comparing M0 against a model where every effect size is allowed to have its own fixed effect. So the test statistics are not the same, but conceptually, the two approaches are comparable.
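
To make the Q-test side of this concrete, here is a minimal base-R sketch of the generalized Q-statistic for an intercept-only model. The four effect sizes and their V matrix are made up purely for illustration (they are not from your data), and the object names are arbitrary. The point is only that Q uses the inverse of V as the weight matrix, so no random effects enter the computation.

```r
# Toy data (made up for illustration): 4 effect sizes, the first two from the same sample
yi <- c(-0.30, -0.22, -0.35, -0.10)
V  <- diag(c(0.010, 0.012, 0.008, 0.015))
V[1, 2] <- V[2, 1] <- 0.004          # covariance between the two sampling errors

X <- matrix(1, nrow = length(yi), ncol = 1)   # intercept-only design matrix
W <- solve(V)                                  # weights: inverse of V, no random effects

# Generalized least squares estimate of the common effect (as in M0)
b <- solve(t(X) %*% W %*% X, t(X) %*% W %*% yi)

# Generalized Cochran's Q: weighted sum of squared residuals around that estimate
resid_gls <- yi - X %*% b
Q  <- drop(t(resid_gls) %*% W %*% resid_gls)
df <- length(yi) - ncol(X)
p  <- pchisq(Q, df = df, lower.tail = FALSE)

c(Q = Q, df = df, p = p)
```

Under the null hypothesis that all true effects are equal, Q is referred to a chi-square distribution with k - 1 degrees of freedom (here 3), which is what the last lines do.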

If you want to test for between-country variation, then you can do a likelihood ratio test (LRT) comparing model M1 above against one where the country-level variance component is constrained to 0:

M0a <- rma.mv(yi = Corrz, V = vcov_mat, data = tmp_es_dat, random = ~ 1 | COUNTRY / SampleID / ESID, sigma2 = c(0, NA, NA))
anova(M0a, M1)

Model M0a assumes that there is no between-country variation, but it does allow for between-sample (within country) variation and between-effect-size (within sample) variation. So this is quite different from what the Q-test does (and hence also from the comparison between M0 and M1).
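
If you want to try this out on data that anybody can load, here is a rough sketch using 'dat.konstantopoulos2011' (schools nested within districts), which comes with metafor. This is only a stand-in for your data: it has two levels instead of three and uses sampling variances 'vi' instead of a full V matrix, and the object names are arbitrary. It fits the full model, the model with the top-level variance component fixed at 0, and the model without random effects, and then shows that the Q-statistic is the same for all three, since it does not depend on the random effects structure.

```r
library(metafor)

# Stand-in data shipped with metafor: schools nested within districts
dat <- dat.konstantopoulos2011

# Full model: district- and school-level variance components both estimated
full <- rma.mv(yi, vi, random = ~ 1 | district/school, data = dat)

# Constrained model: district-level variance fixed at 0, school-level still estimated
no_district <- rma.mv(yi, vi, random = ~ 1 | district/school, data = dat,
                      sigma2 = c(0, NA))

# Model without any random effects
none <- rma.mv(yi, vi, data = dat)

# LRT for the district-level variance component only
lrt <- anova(no_district, full)
lrt

# The Q-statistic is identical across the three models
c(full = full$QE, no_district = no_district$QE, none = none$QE)
```

The LRT here addresses only the top-level variance component, while the Q-statistic does not change with the random effects that are included in the model.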

I hope this clarifies things.

Best,
Wolfgang

> -----Original Message-----
> From: R-sig-meta-analysis <r-sig-meta-analysis-bounces using r-project.org> On Behalf
> Of Martin Brunner via R-sig-meta-analysis
> Sent: Wednesday, September 11, 2024 10:23
> To: r-sig-meta-analysis using r-project.org
> Cc: Martin Brunner <martin.brunner using uni-potsdam.de>
> Subject: [R-meta] Interpretation of the Q-test statistic in a multilevel meta-
> analysis
>
> Dear List Members,
> We employed the rma.mv function from the metafor package to perform a
> meta-analysis where effect sizes were nested within samples, and samples
> were nested within countries. The total number of effect sizes exceeded
> 8,000. Below, I provide a toy example, in which I randomly sampled 626
> effect sizes from 351 samples across 87 countries.
> We specified a variance-covariance matrix (vcov_mat) to account for the
> observed effect sizes within each sample. The corresponding code was as
> follows:
>
> M1 <- rma.mv(yi = Corrz, V = vcov_mat, data = tmp_es_dat, random = list(~ 1
> | COUNTRY / SampleID / ESID), sparse = FALSE)
>
> Here are the results:
> Multivariate Meta-Analysis Model (k = 626; method: REML)
>
>      logLik    Deviance         AIC         BIC        AICc
>    728.1443  -1456.2886  -1448.2886  -1430.5376  -1448.2241
>
> Variance Components:
>
>              estim    sqrt  nlvls  fixed                 factor
> sigma^2.1  0.0042  0.0648     87     no                COUNTRY
> sigma^2.2  0.0037  0.0610    351     no       COUNTRY/SampleID
> sigma^2.3  0.0021  0.0459    626     no  COUNTRY/SampleID/ESID
>
> Test for Heterogeneity:
> Q(df = 625) = 23584.2025, p-val < .0001
>
> Model Results:
>
> estimate      se      zval    pval    ci.lb    ci.ub
>   -0.2620  0.0085  -30.7263  <.0001  -0.2788  -0.2453  ***
>
> In addition to I² and the variance components at various levels (effect
> sizes, samples, and countries), we used the Q-test statistic to assess the
> heterogeneity of effect sizes.
> An expert reviewer of our meta-analysis pointed out potential ambiguities in
> how we interpreted the Q-test statistic. Specifically, the reviewer said
> that the Q-test statistic is "the test of the between-clusters variation
> (whatever the clusters are in the model)."
> However, I am unsure how to apply this interpretation to the Q-test
> statistic included in the metafor output. I learned from the help section of
> the rma.mv function that the Q "is the generalized/weighted least squares
> extension of Cochran's Q-test, which tests whether the variability in the
> observed effect sizes or outcomes is larger than one would expect based on
> sampling variability (and the given covariances among the sampling errors)
> alone. A significant test suggests that the true effects/outcomes are
> heterogeneous."
> In our case, the Q suggests that the observed effect sizes vary
> significantly (p < .0001) around the average effect size (r = -0.26).
> Furthermore, the Q provided by metafor points to statistically significant
> heterogeneity, with heterogeneity referring to the total variance
> encompassing all potential sources of variance, including effect sizes,
> samples, and countries. However, I am unsure whether this is what the
> reviewer meant by interpreting the Q as "between-clusters variation."
> I would highly appreciate any help in clarifying the interpretation of the
> Q-test statistic.
> Thank you!
> Best regards,
> Martin
>
> PS: I apologize for the poor formatting of the metafor output, but my email
> program does not support better formatting options.