[R-meta] Non-independent effect sizes for moderator analysis in meta-analysis on odds ratios

Viechtbauer, Wolfgang (NP) wolfgang.viechtbauer at maastrichtuniversity.nl
Tue Jun 13 08:58:09 CEST 2023


Dear Lukas,

You are asking about an issue that has been discussed quite extensively on this mailing list, but let me repeat some of the relevant points:

If two odds ratios come from the same sample, then they are not independent. Ignoring this dependency doesn't make your results "biased" (at least not in the sense of how bias is typically defined in statistics); the real issue is that the standard errors of the coefficients in the meta-regression model tend to be too small, leading to inflated Type I error rates and confidence intervals that are too narrow.

To deal with such dependency, one should ideally do several things:

1) Calculate the covariance between the dependent estimates. Just like we can compute the sampling variance of each log odds ratio, we can also compute their covariance. However, doing so is often tricky because the information needed to compute the covariance is typically not reported. Alternatively, one can compute an approximate covariance, making assumptions about the degree of dependency between the estimates (e.g., if the two log odds ratios are assessing a treatment effect at two different timepoints, then they will tend to be more correlated when the two timepoints are closer to each other; or if the two log odds ratios are assessing a treatment effect for two different dichotomous response variables, then they will tend to be more correlated when the two variables themselves are strongly correlated). One can use the vcalc() function to approximate the covariance under an assumed degree of correlation (see the code sketch after this list). Typically, when 'guesstimating' the correlation in this manner, one also conducts a sensitivity analysis, assessing whether the conclusions remain unchanged when different degrees of correlation are assumed.

2) In addition to 1), one should try to account for the dependency that may arise in the underlying true effects (i.e., the true log odds ratios). This can be done via a multilevel/multivariate model. This is what you have done with rma.mv().

3) Finally, one can consider using cluster-robust inference methods (also known as robust variance estimation). However, with a small number of studies, this might not work so well. Alternatively, one can consider using bootstrapping (see https://doi.org/10.1002/jrsm.1554 and the wildmeta package).
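To make steps 1) to 3) concrete, here is a minimal sketch using the variable names from your dataset (yi, vi, ID, ES_ID, and HighACEDef in dat1); the value rho = 0.6 is just an assumption that you would then vary in the sensitivity analysis:

library(metafor)

# 1) approximate the var-cov matrix of the estimates, assuming a
# correlation of rho = 0.6 between estimates from the same sample
# (an assumed value; rerun with, e.g., rho = 0.4 and rho = 0.8 as
# a sensitivity analysis)
V <- vcalc(vi, cluster = ID, obs = ES_ID, rho = 0.6, data = dat1)

# 2) multilevel/multivariate model to account for dependency in the
# underlying true effects
res <- rma.mv(yi, V, random = ~ ES_ID | ID,
              mods = ~ factor(HighACEDef), data = dat1)
res

# 3) cluster-robust inference (clubSandwich = TRUE uses the small-sample
# adjustments from the clubSandwich package, which is advisable with
# only 6 samples)
robust(res, cluster = ID, clubSandwich = TRUE)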

See also:

https://wviechtb.github.io/metafor/reference/misc-recs.html#general-workflow-for-meta-analyses-involving-complex-dependency-structures

As for your last question: Not sure what exactly you mean by "the results". Based on the meta-regression model, you can compute predicted effects (log odds ratios), which you can back-transform to odds ratios. This can be done with predict(..., transf=exp).
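For example, with the rma.mv() model above stored in 'Analysis2', the following gives the predicted odds ratios (with confidence intervals) for the two levels of your moderator, where 0 and 1 correspond to the dummy coding of factor(HighACEDef):

predict(Analysis2, newmods = c(0, 1), transf = exp)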

Best,
Wolfgang

>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
>Behalf Of Sotola, Lukas K [PSYCH] via R-sig-meta-analysis
>Sent: Monday, 12 June, 2023 17:01
>To: r-sig-meta-analysis using r-project.org
>Cc: Sotola, Lukas K [PSYCH]
>Subject: [R-meta] Non-independent effect sizes for moderator analysis in meta-
>analysis on odds ratios
>
>Dear all,
>
>I am trying to do a meta-analysis with odds ratios, but am running into one
>issue. I have a few effect sizes that are not independent. That is, there are a
>few samples in my data where we extracted two odds ratios for the purposes of
>performing a moderator analysis. Thus, while we have 6 samples (k = 6), we have 9
>effect sizes, as for three of the samples we coded two odds ratios. Normally, when I run a
>meta-analysis with correlation coefficients, there is some way to indicate a
>"Study ID" variable of some sort, which will allow the analysis to distinguish
>effect sizes taken from the same sample for a moderator analysis. That way,
>multiple effect sizes from the same sample do not get counted multiple times.
>However, I cannot seem to find an analogous feature using metafor, and even when
>I run the meta-analysis with a moderator analysis, in the overall analysis (e.g.,
>for heterogeneity and such), the output indicates a "k" that is equal to the
>number of effect sizes and not equal to the real number of samples (i.e., k = 9).
>This makes me worry that the results I'm getting are biased.
>
>After using the "escalc" function to create a new dataset with odds ratios as
>shown on the metafor website, I have attempted the analysis itself in two ways so
>far. The name of my moderator variable is "HighACEDef" and it has two levels. For
>my first attempt at the meta-analysis, I use the following R code and get the
>output that follows it:
>
>Analysis1 <- rma(yi, vi, mods = ~factor(HighACEDef), data=dat1)
>Analysis1
>
>Mixed-Effects Model (k = 9; tau^2 estimator: REML)
>
>tau^2 (estimated amount of residual heterogeneity):     0.0252 (SE = 0.0203)
>tau (square root of estimated tau^2 value):             0.1587
>I^2 (residual heterogeneity / unaccounted variability): 82.50%
>H^2 (unaccounted variability / sampling variability):   5.71
>R^2 (amount of heterogeneity accounted for):            0.00%
>
>Test for Residual Heterogeneity:
>QE(df = 7) = 33.6446, p-val < .0001
>
>Test of Moderators (coefficient 2):
>QM(df = 1) = 0.7218, p-val = 0.3955
>
>Model Results:
>
>                       estimate      se    zval    pval    ci.lb   ci.ub
>intrcpt                  0.8144  0.0853  9.5439  <.0001   0.6472  0.9817  ***
>factor(HighACEDef)>=4    0.1151  0.1355  0.8496  0.3955  -0.1505  0.3807
>
>---
>Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>The second way I have tried it is with this code and I get the output that
>follows it. "ES_ID" is a unique identification number for each separate effect
>size, whereas "ID" is a unique identification number for each sample we coded
>from. Thus, "ES_ID" has the values of 1-9 while "ID" has the values of 1-6.
>
>Analysis2 <- rma.mv(yi, vi, random = ~ ES_ID | ID, mods = ~factor(HighACEDef),
>data=dat1)
>Analysis2
>
>Multivariate Meta-Analysis Model (k = 9; method: REML)
>
>Variance Components:
>
>outer factor: ID    (nlvls = 6)
>inner factor: ES_ID (nlvls = 9)
>
>            estim    sqrt  fixed
>tau^2      0.0302  0.1739     no
>rho        0.6011             no
>
>Test for Residual Heterogeneity:
>QE(df = 7) = 33.6446, p-val < .0001
>
>Test of Moderators (coefficient 2):
>QM(df = 1) = 0.8522, p-val = 0.3559
>
>Model Results:
>
>                       estimate      se    zval    pval    ci.lb   ci.ub
>intrcpt                  0.8213  0.0914  8.9849  <.0001   0.6421  1.0004  ***
>factor(HighACEDef)>=4    0.0967  0.1047  0.9232  0.3559  -0.1086  0.3020
>
>---
>Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>I would much appreciate any help with doing this correctly, or just a
>confirmation that either of the ways I have already done it is correct.
>
>I also had one last question. Is there any way the results can be expressed as
>odds ratios rather than as log odds ratios?
>
>Thank you,
>
>Lukas Sotola
>
>Lukas K. Sotola, M.S.
>PhD Candidate: Iowa State University
>Department of Psychology
>He/him


