[R-meta] Inflated confidence intervals

Viechtbauer, Wolfgang (SP) wolfgang.viechtbauer using maastrichtuniversity.nl
Sun Sep 16 15:30:01 CEST 2018

I apparently haven't had enough coffee today, so first a correction on my part:

predict(resMV, newmods = c(1,0)) gives the estimated (average) outcome for HDL (i.e., -0.2376 + 0.1982 = -0.0394). The coefficient for HDL (i.e., 0.1982) is already the difference between HDL and LDL.

But you seem to be after something different. Apparently, you want to add the intercept and the two coefficients together, so: -0.2376 + 0.1982 + -0.0148 =~ -0.0541, which indeed you would obtain with predict(resMV, newmods = c(1,1)). But what is the meaning of this?

If you want a marginal mean, that is, the average of the three outcomes, then you would want:

intercept + 1/3 * HDL + 1/3 * TC

(assuming the intercept corresponds to LDL, as in the output you showed), which you would get with predict(resMV, newmods = c(1/3,1/3)). But maybe I am still misunderstanding.
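As a rough check of the arithmetic above, the point estimates for each newmods vector can be reproduced by hand from the three coefficients in the quoted output. This is only a sketch in base R: it does not call metafor, it uses the rounded coefficients as printed (so the last digit can differ from what predict() reports), and the names b and X are made up for illustration.

```r
# Coefficients copied from the "Model Results" table in the quoted output
# (rounded to four decimals there, so results here are approximate).
b <- c(intrcpt = -0.2376, MeasurementHDL = 0.1982, MeasurementTC = -0.0148)

# Each row is 1 for the intercept followed by a newmods vector.
X <- rbind(
  "LDL:      newmods = c(0,0)"     = c(1, 0,   0),
  "HDL:      newmods = c(1,0)"     = c(1, 1,   0),
  "TC:       newmods = c(0,1)"     = c(1, 0,   1),
  "sum:      newmods = c(1,1)"     = c(1, 1,   1),
  "marginal: newmods = c(1/3,1/3)" = c(1, 1/3, 1/3)
)

# Predicted value = row of X times the coefficient vector.
pred <- drop(X %*% b)
print(round(pred, 4))

# The marginal mean is the plain average of the three outcome-specific means.
print(all.equal(unname(pred[5]), mean(pred[1:3])))
```

The c(1,1) row is simply the sum of all three coefficients, which does not correspond to any one outcome or to their average; the c(1/3,1/3) row is the one that matches the average of the LDL, HDL and TC estimates.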


-----Original Message-----
From: Wasim Iqbal (UG) [mailto:W.Iqbal using newcastle.ac.uk] 
Sent: Sunday, 16 September, 2018 14:35
To: Viechtbauer, Wolfgang (SP)
Cc: r-sig-meta-analysis using r-project.org; Gavin Stewart; Chris Seal
Subject: RE: Inflated confidence intervals

Just to clarify, 

If we were to obtain the overall summary measure for the model (not the difference between the parameters) we would add the intercept to the difference, like this to obtain the adjusted mean:

A) Y=Mx+b+b
B) Y= intercept + HDL + LDL

We are going to show the adjusted mean with the unadjusted mean in our publication. 

Thank you 

Kind regards 

On 16 Sep 2018 12:53 pm, "Viechtbauer, Wolfgang (SP)" <wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
Sorry, you just lost me.

To repeat:

For the model you showed (which apparently was fitted with 'mods = ~ measurement', since it includes an intercept term), if you want to know the difference between HDL and LDL, then you should use:

predict(resMV, newmods = c(1,0))

which is identical to:

predict(resMV, newmods = c(1,0), intercept=TRUE)

As I showed, this gives the same point estimate for the difference between HDL and LDL as obtained from the model with only the two outcomes.


-----Original Message-----
From: Wasim Iqbal (UG) [mailto:W.Iqbal using newcastle.ac.uk] 
Sent: Sunday, 16 September, 2018 13:04
To: Viechtbauer, Wolfgang (SP); r-sig-meta-analysis using r-project.org
Cc: Gavin Stewart; Chris Seal
Subject: Re: Inflated confidence intervals

Dear Wolfgang, 

Thank you, however I believe I have done this in the final predict function in the email. I was trying to show how this was different from the full model and a model that included only LDL and HDL.

I did not include the intercept in the initial code:

>rma.mv(x, mods=measurement) as opposed to >rma.mv(x, mods=measurement-1)

Then, when I call the command: >predict(x, newmods=c(1,1), intercept=T), surely I must include intercept as true? Otherwise this does not include it unless I had included it in the initial code.
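To make the intercept question concrete, here is a small base R sketch of the two codings being contrasted. It assumes a factor called measurement with LDL as the reference level (as the quoted output suggests) and only uses model.matrix(); no model is fitted, and the object names are hypothetical.

```r
# Hypothetical outcome factor with LDL as the reference (first) level.
measurement <- factor(c("LDL", "HDL", "TC"), levels = c("LDL", "HDL", "TC"))

# Formula with an intercept: the first column is the intercept (= LDL),
# the remaining columns are differences from LDL.
X_int <- model.matrix(~ measurement)

# Formula with the intercept removed: one column per outcome.
X_noint <- model.matrix(~ measurement - 1)

print(colnames(X_int))
print(colnames(X_noint))
print(X_int)
print(X_noint)
```

With the first coding, a newmods vector has two elements (HDL, TC) and the intercept is added on top; with the second, there is no intercept term and each of the three coefficients is already an outcome-specific mean.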

Kind regards

From: R-sig-meta-analysis <r-sig-meta-analysis-bounces using r-project.org> on behalf of Viechtbauer, Wolfgang (SP) <wolfgang.viechtbauer using maastrichtuniversity.nl>
Sent: 16 September 2018 11:42:23
To: Wasim Iqbal (UG); r-sig-meta-analysis using r-project.org
Cc: Gavin Stewart; Chris Seal
Subject: Re: [R-meta] Inflated confidence intervals

Dear Wasim,

It seems like you are trying to obtain the estimated (average) difference between HDL and LDL. Based on the model with only HDL and LDL, this is equal to -0.0394 (with 95% CI: -0.1254 to 0.0466). However, your code to obtain the analogous value from the model with all three outcomes is not correct. It should be:

predict(resMV, newmods = c(1,0))

(intercept=TRUE isn't needed, since this is the default here). Then this will yield: -0.2376 + 0.1982 = -0.0394, which is the exact same value as obtained from the model with only the two outcomes. I cannot tell you what the CI will be based on the output provided, but I suspect you will get something very similar as you obtained from the model with only the two outcomes.


-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Wasim Iqbal (UG)
Sent: Sunday, 16 September, 2018 8:37
To: r-sig-meta-analysis using r-project.org
Cc: Gavin Stewart; Chris Seal
Subject: [R-meta] Inflated confidence intervals

I have finally understood and found a simple solution or reason for inflated confidence intervals, and I thought I would post it here so that it may help others.

I am conducting a multivariate meta-analysis for an upcoming publication in the Campbell library (https://campbellcollaboration.org). This will involve the analysis of various biomarkers. I had made a list of a posteriori contrasts. However, when I compiled these interactions in a multivariate meta-analysis, some models led to wider confidence intervals. At first, I assumed this was because my model had collapsed or that the variability between studies was great. After a lot of reading and contemplating, I can conclude that this was in fact due to highly correlated outcomes (or multicollinearity).

My multivariate meta-analysis relies on an assumed Pearson correlation between outcomes. If the assumed correlation is high, this will lead to either an upward or downward bias of the confidence intervals. This is because the multivariate meta-analysis model restricts the average correlation to between -1.00 and +1.00 to produce a symmetric (positive definite) var-cov matrix so that the model can be formed. Removing outcomes that appear to be highly correlated leads to narrower confidence intervals.

Here is an example of a model with studies looking at serum total cholesterol, high density lipoprotein and low density lipoprotein:

Multivariate Meta-Analysis Model (k = 105; method: REML)

Variance Components:

outer factor: Study       (nlvls = 35)
inner factor: Measurement (nlvls = 3)

            estim    sqrt  k.lvl  fixed  level
tau^2.1    0.0812  0.2850     35     no  A LDL
tau^2.2    0.0076  0.0874     35     no    HDL
tau^2.3    0.0487  0.2208     35     no     TC

       rho.ALDL  rho.HDL  rho.TC    ALDL  HDL  TC
A LDL         1   0.4819  0.8774       -   no  no
HDL      0.4819        1  0.8432      35    -  no
TC       0.8774   0.8432       1      35   35   -

Test for Residual Heterogeneity:
QE(df = 102) = 207.8936, p-val < .0001

Test of Moderators (coefficient(s) 2:3):
QM(df = 2) = 18.1033, p-val = 0.0001

Model Results:

                estimate      se     zval    pval    ci.lb    ci.ub
intrcpt          -0.2376  0.0648  -3.6655  0.0002  -0.3647  -0.1106  ***
MeasurementHDL    0.1982  0.0513   3.8681  0.0001   0.0978   0.2987  ***
MeasurementTC    -0.0148  0.0656  -0.2251  0.8219  -0.1433   0.1138

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The above model includes 35 studies, each contributing all three outcomes. Total serum cholesterol (TC) appears to be highly correlated with low density lipoprotein (LDL) and high density lipoprotein (HDL). Therefore, when we use the predict function, this leads to a much wider confidence interval than a bivariate model that includes only LDL and HDL.
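For readers who want to look at the estimated between-study structure directly, the tau^2 and rho values printed in the output can be assembled into a var-cov matrix and inspected in base R. This is only a sketch built from the rounded numbers shown above; the object names (tau2, R, G, ev) are made up for illustration and nothing here comes from the fitted model object itself.

```r
lv <- c("LDL", "HDL", "TC")

# Variance components and correlations copied from the printed output.
tau2 <- c(LDL = 0.0812, HDL = 0.0076, TC = 0.0487)
R <- matrix(c(1,      0.4819, 0.8774,
              0.4819, 1,      0.8432,
              0.8774, 0.8432, 1),
            nrow = 3, byrow = TRUE, dimnames = list(lv, lv))

# Var-cov matrix of the random effects: G = D R D with D = diag(tau).
D <- diag(sqrt(tau2))
G <- D %*% R %*% D
dimnames(G) <- list(lv, lv)
print(round(G, 4))

# Eigenvalues of the correlation matrix; a smallest eigenvalue close to zero
# means the matrix is close to singular.
ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
print(ev)
print(det(R))
```

With the rounded values as printed, the smallest eigenvalue comes out very close to zero, which would be consistent with the estimated correlation matrix sitting near the edge of the positive definite region. Since this is based on four-decimal output, it should be read as a hint to examine the fitted model more closely, not as a diagnosis.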

Model with all three contrasts/interactions:
 >predict(resMV,newmods = c(1,1),intercept = T)
    pred     se   ci.lb  ci.ub cr.lb cr.ub tau2.level
 -0.0541 0.0588 -0.1695 0.0612    NA    NA         NA

Model with only HDL and LDL:
    pred     se   ci.lb  ci.ub cr.lb cr.ub tau2.level
 -0.0394 0.0439 -0.1254 0.0466    NA    NA         NA
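The two intervals above can be reproduced, up to rounding, as normal-theory (Wald-type) intervals of the form pred +/- z * se, which also makes the difference in width explicit. The helper wald_ci below is a hypothetical name written for this sketch; it assumes z-based 95% intervals, which is what the zval column in the output suggests.

```r
# Hypothetical helper: Wald-type interval from a point estimate and its SE.
wald_ci <- function(pred, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  c(ci.lb = pred - z * se, ci.ub = pred + z * se, width = 2 * z * se)
}

# Values copied from the two predict() outputs quoted above.
three_outcomes <- wald_ci(pred = -0.0541, se = 0.0588)  # newmods = c(1,1)
two_outcomes   <- wald_ci(pred = -0.0394, se = 0.0439)  # HDL/LDL-only model

print(round(rbind(three_outcomes, two_outcomes), 4))

# Ratio of the interval widths (equal to the ratio of the standard errors).
print(unname(three_outcomes["width"] / two_outcomes["width"]))
```

Note that the two rows have different point estimates as well as different widths, so it is worth checking that both predictions refer to the same quantity before attributing the wider interval to the correlation between outcomes.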

Another reason for multicollinearity or wider confidence intervals is if your multivariate meta-analysis does not include enough studies to estimate an accurate "between study outcomes Pearson correlation".

Hope this helps anyone that has experienced a similar problem or confusion.

Kind regards
Wasim Iqbal
