[R-meta] confidence interval for I^2 in multilevel model

Fri Aug 25 18:16:32 CEST 2017

I think I may have ended up with an over-specified model.  The result was an estimate of zero for sigma^2.2 for the Study/EffectSizeEntry factor.  (EffectSizeEntry is the row number in my data; each effect size was given a unique identifier.  It equates to each Study/Bethesda combination).  My thought was this specification was analogous to the three-level in Konstantopoulos with my EffectSizeEntry equating to his Study level and my Study equating to his District level.

mvregML <- rma.mv(data=PPV_prevdat, yi, vi, random = ~ 1 | Study/EffectSizeEntry, mods = ~  Bethesda + prev + Blinded + NRAS12_13 + HRAS12_13 + KRAS61)

Even if I take out the moderators I get the zero estimate for sigma^2.2 for the Study/EffectSizeEntry factor.

mvregML <- rma.mv(data=PPV_prevdat, yi, vi, random = ~ 1 | Study/EffectSizeEntry)

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Porter, Kyle
Sent: Friday, August 25, 2017 11:40 AM
To: 'Viechtbauer Wolfgang (SP)'; 'r-sig-meta-analysis at r-project.org'
Subject: Re: [R-meta] confidence interval for I^2 in multilevel model

Wolfgang,

Thank you for the response.  I should have provided more information about the specifics of the meta-analysis we are working on.  The situation is different than the three-level scenario in Konstantopoulos; we have multiple levels of results within study rather than results from multiple studies that are clustered together such as "district" in Konstantopoulos.  I was attempting to account for the correlation in the results within study (in my code "ID" represents the study).  I changed "ID" to "Study" in the message below for clarity.

Our outcome is a single proportion (positive predictive value from the number of true positives (TP) and positive tests (TP + FP, designated PosTest)) and I am using the arcsin transformation.

For each study, we have separated the effect sizes into up to 3 categories based on nodule classification (Bethesda) as either 3, 4, or 5.  So the full dataset has up to 3 rows of data per study with an effect size (PPV) and an indicator (Bethesda) for each specific nodule classification level.

Sample data (first 7 studies; moderator/covariates not shown) Study,Bethesda,TP,PosTest
1,3,4,5
1,4,4,5
1,5,2,2
2,4,3,5
3,3,2,3
3,5,10,10
4,5,1,1
5,4,22,31
6,4,2,15
7,4,12,21
7,5,1,1

I have run separate analyses for each Bethesda level to evaluate the proportional of residual heterogeneity.  I was able to get I^2 and the corresponding confidence intervals for these models:

mregB3 <- rma.uni(measure="PAS", data=B3, xi=TP, ni=posTest, method="REML", mods = ~ factor(Blinded) + prev + factor(HRAS12_13) + factor(KRAS61) + factor(NRAS12_13))
mregB3
confint(mregB3)

where B3 is a data frame subset containing only Bethesda level 3 nodules.  I did the same for 4 and 5 separately.

We would also like to estimate I^2 with confidence intervals using all classifications combined (3, 4, 5).  Now we have multiple data points per study, so my thinking was I need to account for potential correlation among these within-study effect sizes.  Having thought further about this since my initial post, I wonder if the existing study-level random effect may be sufficient along with the indicator variables for the Bethesda factor to account for the within-study correlation.  But if I run:

mvregAll2 <- rma(measure="PAS",data=PPVdat, xi=TP, ni=posTest, method="REML", mods = ~ factor(Bethesda) + factor(Blinded) + factor(NRAS12_13) + factor(HRAS12_13) + factor(KRAS61))

then the model is treating each effect size independently, as if it were from a separate study.

So back to my original attempt:

PPVdat <- escalc(measure="PAS", data=RAS_PPV, xi=TP, ni=posTest) mvregAll <- rma.mv(data=PPVdat, yi, vi, method="REML", random = ~ 1 | Study, mods = ~ factor(Bethesda) + factor(Blinded) + factor(NRAS12_13) + factor(HRAS12_13) + factor(KRAS61))

The results of mvregAll2 and mvregAll are similar, but I think mvregAll does have the clustered relationship I am looking to include, but perhaps the random effect is conflating study level variation with "sub-study" level variation (variation across each row of data).  Perhaps I need to add a variance component at the effect size level (as if they were separate studies) and then also the study level variance component and specify a model in the three-level form.

I will attempt that and report back.  Any other feedback is appreciated in the meantime.  (Also thanks for pointing out that struct="DIAG" was not necessary and not functional in this model;  I am a SAS user with much less R experience so my "translation" of model specification was off).

Thanks,
Kyle

-----Original Message-----
From: Viechtbauer Wolfgang (SP) [mailto:wolfgang.viechtbauer at maastrichtuniversity.nl]
Sent: Wednesday, August 23, 2017 4:28 PM
To: Porter, Kyle; 'r-sig-meta-analysis at r-project.org'
Subject: RE: confidence interval for I^2 in multilevel model

Hi Kyle,

With confint(), you can get CIs for variance components. You can plug the CI bounds for a variance component into the equation for I^2 to obtain a CI for I^2. Here is an example:

dat <- get(data(dat.konstantopoulos2011))

res <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat) res

sav <- confint(res)
sav

W <- diag(1/dat$vi)
X <- model.matrix(res)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
100 * res$sigma2 / (res$sigma2 + (res$k-res$p)/sum(diag(P))) ### district and school level I^2
100 * sav[[1]]$random[1,2:3] / (sav[[1]]$random[1,2:3] + (res$k-res$p)/sum(diag(P))) ### CI for district-level I^2
100 * sav[[2]]$random[1,2:3] / (sav[[2]]$random[1,2:3] + (res$k-res$p)/sum(diag(P))) ### CI for the school-level I^2

100 * sum(res$sigma2) / (sum(res$sigma2) + (res$k-res$p)/sum(diag(P))) ### total I^2

But you cannot use this approach to get a CI for the total I^2. For this, you have to use the 'multivariate parameterization' of this model (see: https://urldefense.proofpoint.com/v2/url?u=http-3A__www.metafor-2Dproject.org_doku.php_analyses-3Akonstantopoulos2011&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=hxh2aEZaqVvq1Q-AYEVNruFcTlW7SyqodvlD9DYBAEk&s=pyRjSQtkISt7XKRSNfqaY_z1vgavriBqDXXixE9S5FI&e= ). So:

res <- rma.mv(yi, vi, random = ~ factor(study) | district, data=dat) res

sav <- confint(res)
sav

W <- diag(1/dat$vi)
X <- model.matrix(res)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
100 * res$tau2 / (res$tau2 + (res$k-res$p)/sum(diag(P))) ### total I^2
100 * sav$random[1,2:3] / (sav$random[1,2:3] + (res$k-res$p)/sum(diag(P))) ### CI for the total I^2

But you are using 'random = ~ 1 | ID', which does not give you a multilevel model. Are you maybe making this mistake?

https://urldefense.proofpoint.com/v2/url?u=http-3A__www.metafor-2Dproject.org_doku.php_analyses-3Akonstantopoulos2011-23a-5Fcommon-5Fmistake-5Fin-5Fthe-5Fthree-2Dlevel-5Fmodel&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=hxh2aEZaqVvq1Q-AYEVNruFcTlW7SyqodvlD9DYBAEk&s=4gv38VBjmsoL43FcL-G8bvOmX2g8qVxzKVXFuTNgR94&e= 

Also, 'struct' is only relevant when you have something like a 'random = ~ inner | outer' structure, so struct="DIAG" has no effect.

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Porter, Kyle
Sent: Wednesday, August 23, 2017 22:09
To: 'r-sig-meta-analysis at r-project.org'
Subject: [R-meta] confidence interval for I^2 in multilevel model

Hello all,

I am using metafor to calculate an I^2 analogue for a multilevel random effects meta-regression as described at https://urldefense.proofpoint.com/v2/url?u=http-3A__www.metafor-2Dproject.org_doku.php_tips-3Ai2-5Fmultilevel-5Fmultivariate&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=hxh2aEZaqVvq1Q-AYEVNruFcTlW7SyqodvlD9DYBAEk&s=fTY0NPh8yHLSU5xGmS7Jw_AwN_BgYfW7LsgNTFTiy6g&e= 

I would like help in calculating a 95% confidence interval for this derivation of I^2.

Specifically, I am using the code taken below from the website to generalize the concept of I^2 to a clustered/multilevel model.

(the text below is taken from https://urldefense.proofpoint.com/v2/url?u=http-3A__www.metafor-2Dproject.org_doku.php_tips-3Ai2-5Fmultilevel-5Fmultivariate-29-3A&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=hxh2aEZaqVvq1Q-AYEVNruFcTlW7SyqodvlD9DYBAEk&s=PB2r6RL4qfP4E9ZKl45VK7T46cUlULNdLDv0O0J6nqI&e=
Based on the discussion above, it is now very easy to generalize the concept of I2 to such a model (see also Nakagawa & Santos, 2012). That is, we can first compute:

W <- diag(1/dat$vi)
X <- model.matrix(res)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
100 * sum(res$sigma2) / (sum(res$sigma2) + (res$k-res$p)/sum(diag(P)))

My specific application is as follows (the outcome is a single proportion):

PPV_prevdat <- escalc(measure="PAS", data=PPV_prev, xi=TP, ni=posTest)

mvregAll <- rma.mv(data=PPV_prevdat, yi, vi, method="REML", random = ~ 1 | ID, struct="DIAG", mods = ~ factor(Bethesda) + factor(Blinded) + prev +factor(NRAS12_13) + factor(HRAS12_13) + factor(KRAS61))

w <- diag(1/PPV_prevdat$vi)
x <- model.matrix(mvregAll)
P <- w - w %*% x %*% solve(t(x) %*% w %*% x) %*% t(x) %*% w
100 * sum(mvregAll$sigma2) / (sum(mvregAll$sigma2) + (mvregAll$k-mvregAll$p)/sum(diag(P)))

Any help/suggestion is appreciated.

Thank you,
Kyle Porter

_______________________________________________
R-sig-meta-analysis mailing list
R-sig-meta-analysis at r-project.org
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Dsig-2Dmeta-2Danalysis&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=cL0nylYpakMIH5v53xMlrI5ibJAnP_QDkcuuJrfcg3s&s=csTsEClS5f8ANy5yUKOf_W6cppA5CcbG4FiEedUwYrk&e=