[R-meta] Inflated confidence intervals
Wasim Iqbal (UG)
W@Iqb@l @ending from newc@@tle@@c@uk
Sun Sep 16 08:36:53 CEST 2018
I have finally understood, and found a simple reason (and solution) for, inflated confidence intervals, and I thought I would post it here in case it helps others.
I am conducting a multivariate meta-analysis for an upcoming publication in the Campbell library (https://campbellcollaboration.org). This will involve the analysis of various biomarkers. I had made a list of a posteriori contrasts. However, when I compiled these interactions in a multivariate meta-analysis, some models led to wider confidence intervals. At first, I assumed this was because my model had collapsed or because the variability between studies was large. After a lot of reading and contemplating, I can conclude that this was in fact due to highly correlated outcomes (or multicollinearity).
My multivariate meta-analysis relies on an assumed Pearson correlation between outcomes. If the outcomes are highly correlated, this can bias the confidence intervals either upward or downward. As I understand it, this is because the multivariate meta-analysis model restricts the correlations to lie between -1.00 and +1.00 so that the var-cov matrix is symmetric and positive definite and the model can be fitted. Removing outcomes that appear to be highly correlated leads to narrower confidence intervals.
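To make the role of the assumed correlation concrete, here is a minimal base R sketch of how a var-cov matrix for one study could be built from sampling variances and a single assumed correlation, and how positive definiteness can be checked. The sampling variances, the correlation values and the helper names make_V and is_pos_def are made up for illustration; they are not taken from my data or from any package.

```r
# Hypothetical sampling variances for three outcomes in one study
# (illustrative values only, not from the data set in this post).
vi <- c(LDL = 0.040, HDL = 0.035, TC = 0.045)

# Build a var-cov matrix from one assumed correlation r between every
# pair of outcomes: cov(i, j) = r * sqrt(v_i * v_j).
make_V <- function(vi, r) {
  R <- matrix(r, nrow = length(vi), ncol = length(vi))
  diag(R) <- 1
  V <- R * tcrossprod(sqrt(vi))
  dimnames(V) <- list(names(vi), names(vi))
  V
}

# A symmetric matrix is positive definite when all eigenvalues are > 0.
is_pos_def <- function(M) {
  all(eigen(M, symmetric = TRUE, only.values = TRUE)$values > 0)
}

V_mod  <- make_V(vi, r = 0.50)
V_high <- make_V(vi, r = 0.95)

# Smallest eigenvalue: it moves toward zero as r approaches 1,
# i.e. the matrix gets closer to singular.
min_eig_mod  <- min(eigen(V_mod,  symmetric = TRUE)$values)
min_eig_high <- min(eigen(V_high, symmetric = TRUE)$values)

# A set of pairwise correlations that cannot all hold at once
# gives a matrix that is not positive definite.
R_bad <- matrix(c( 1.0, -0.9, 0.9,
                  -0.9,  1.0, 0.9,
                   0.9,  0.9, 1.0), nrow = 3, byrow = TRUE)

print(is_pos_def(V_mod))
print(is_pos_def(V_high))
print(is_pos_def(R_bad))
print(c(min_eig_mod = min_eig_mod, min_eig_high = min_eig_high))
```

Both matrices built with a single r are valid, but the one with r = 0.95 is much closer to singular, which is the situation I mean by highly correlated outcomes.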
Here is an example of a model with studies looking at serum total cholesterol, high density lipoprotein and low density lipoprotein:
Multivariate Meta-Analysis Model (k = 105; method: REML)
outer factor: Study (nlvls = 35)
inner factor: Measurement (nlvls = 3)
          estim    sqrt  k.lvl  fixed  level
tau^2.1  0.0812  0.2850     35     no  A LDL
tau^2.2  0.0076  0.0874     35     no    HDL
tau^2.3  0.0487  0.2208     35     no     TC

       rho.ALDL  rho.HDL  rho.TC    ALDL  HDL  TC
A LDL         1   0.4819  0.8774       -   no  no
HDL      0.4819        1  0.8432      35    -  no
TC       0.8774   0.8432       1      35   35   -
Test for Residual Heterogeneity:
QE(df = 102) = 207.8936, p-val < .0001
Test of Moderators (coefficient(s) 2:3):
QM(df = 2) = 18.1033, p-val = 0.0001
                estimate      se     zval    pval    ci.lb    ci.ub
intrcpt          -0.2376  0.0648  -3.6655  0.0002  -0.3647  -0.1106  ***
MeasurementHDL    0.1982  0.0513   3.8681  0.0001   0.0978   0.2987  ***
MeasurementTC    -0.0148  0.0656  -0.2251  0.8219  -0.1433   0.1138
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The above model includes 35 studies, each contributing all three outcomes. Total serum cholesterol (TC) appears to be highly correlated with both low density lipoprotein (LDL) and high density lipoprotein (HDL). Therefore, when we use the predict function, we get a much wider confidence interval than from a bivariate model that includes only LDL and HDL.
Model with all three contrasts/interactions:
> predict(resMV, newmods = c(1, 1), intercept = TRUE)
    pred      se    ci.lb   ci.ub  cr.lb  cr.ub  tau2.level
 -0.0541  0.0588  -0.1695  0.0612     NA     NA          NA
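As a rough check, the prediction and its interval can be reproduced by hand from the coefficient table of the three-outcome model. This is only a sketch: it plugs in the standard error reported by predict() rather than recomputing it, because the full covariance matrix of the coefficients is not shown in this post.

```r
# Coefficients as printed in the three-outcome model output.
b <- c(intrcpt = -0.2376, MeasurementHDL = 0.1982, MeasurementTC = -0.0148)

# newmods = c(1, 1) with the intercept included corresponds to x = (1, 1, 1).
x <- c(1, 1, 1)

# Predicted value is the linear combination x'b.
pred <- sum(x * b)

# Standard error as printed by predict(). In general it is
# sqrt(t(x) %*% vcov(model) %*% x), which needs the covariances
# between the coefficient estimates as well as their variances.
se <- 0.0588

# Wald-type 95% interval based on the normal distribution.
crit  <- qnorm(0.975)
ci_lb <- pred - crit * se
ci_ub <- pred + crit * se

print(round(c(pred = pred, ci.lb = ci_lb, ci.ub = ci_ub), 4))
```

The result should agree with the printed prediction up to rounding of the coefficients. Since the standard error depends on the covariances between the coefficients, correlated outcomes can change the interval width even when the individual standard errors look similar.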
Model with only HDL and LDL:
    pred      se    ci.lb   ci.ub  cr.lb  cr.ub  tau2.level
 -0.0394  0.0439  -0.1254  0.0466     NA     NA          NA
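To put a number on "much wider", the two intervals can be compared directly. The values below are copied from the two predict() outputs in this post; the variable names are just for illustration.

```r
# Interval bounds and standard errors as printed for the two models.
ci3 <- c(lb = -0.1695, ub = 0.0612)  # LDL, HDL and TC
ci2 <- c(lb = -0.1254, ub = 0.0466)  # LDL and HDL only
se3 <- 0.0588
se2 <- 0.0439

# Interval widths and their ratio.
width3 <- unname(ci3["ub"] - ci3["lb"])
width2 <- unname(ci2["ub"] - ci2["lb"])
width_ratio <- width3 / width2
se_ratio    <- se3 / se2

print(round(c(width3 = width3, width2 = width2,
              width_ratio = width_ratio, se_ratio = se_ratio), 4))
```

So the interval from the three-outcome model is roughly a third wider than the one from the bivariate model, in line with the ratio of the standard errors.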
Another reason for multicollinearity or wider confidence intervals is that your multivariate meta-analysis may not include enough studies to accurately estimate the between-study Pearson correlation among outcomes.
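To give a feel for how the number of studies affects the precision of a correlation, here is a rough base R illustration using the Fisher z approximation for an ordinary sample Pearson correlation. This is an assumption made purely for illustration: the rho values in the model output are REML estimates of correlations between true effects, not sample correlations, so these intervals do not apply to them directly (I believe confint() in metafor can give profile-likelihood intervals for that purpose). The helper name fisher_ci and the choice of n = 10 are hypothetical.

```r
# Approximate confidence interval for a sample Pearson correlation
# using the Fisher z transformation: z = atanh(r), SE = 1 / sqrt(n - 3).
fisher_ci <- function(r, n, level = 0.95) {
  z    <- atanh(r)
  se   <- 1 / sqrt(n - 3)
  crit <- qnorm(1 - (1 - level) / 2)
  tanh(c(lb = z - crit * se, ub = z + crit * se))
}

# Same correlation (the LDL/TC value printed in the model output),
# treated as if it were based on 35 units versus only 10 units.
ci_35 <- fisher_ci(0.8774, n = 35)
ci_10 <- fisher_ci(0.8774, n = 10)

print(round(ci_35, 3))
print(round(ci_10, 3))
```

With fewer units, the interval around the same correlation is noticeably wider, which is the kind of imprecision I have in mind when too few studies are available.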
Hope this helps anyone that has experienced a similar problem or confusion.