[R-meta] confidence interval for I^2 in multilevel model

Michael Dewey lists at dewey.myzen.co.uk
Fri Aug 25 17:49:12 CEST 2017


Dear Kyle

I will let Wolfgang answer your substantive question but you might be 
interested in a couple of R points sicne you mention you are a convert 
from SAS.

(1) You can permanently set something to be a factor
PPVdat$Bethesda <- factor(PPVdat$Bethesda)

(2) There is a subset argument to many R functions including the ones 
you are using so you do not have to set up separate data.frames for each 
value of Bethesda.

There is nothing wrong with doing it as you do but (1) makes it easier 
to read at least to my eyes, (2) reduces clutter in your workspace.

Michael

On 25/08/2017 16:39, Porter, Kyle wrote:
> Wolfgang,
>
> Thank you for the response.  I should have provided more information about the specifics of the meta-analysis we are working on.  The situation is different than the three-level scenario in Konstantopoulos; we have multiple levels of results within study rather than results from multiple studies that are clustered together such as "district" in Konstantopoulos.  I was attempting to account for the correlation in the results within study (in my code "ID" represents the study).  I changed "ID" to "Study" in the message below for clarity.
>
> Our outcome is a single proportion (positive predictive value from the number of true positives (TP) and positive tests (TP + FP, designated PosTest)) and I am using the arcsin transformation.
>
> For each study, we have separated the effect sizes into up to 3 categories based on nodule classification (Bethesda) as either 3, 4, or 5.  So the full dataset has up to 3 rows of data per study with an effect size (PPV) and an indicator (Bethesda) for each specific nodule classification level.
>
> Sample data (first 7 studies; moderator/covariates not shown)
> Study,Bethesda,TP,PosTest
> 1,3,4,5
> 1,4,4,5
> 1,5,2,2
> 2,4,3,5
> 3,3,2,3
> 3,5,10,10
> 4,5,1,1
> 5,4,22,31
> 6,4,2,15
> 7,4,12,21
> 7,5,1,1
>
> I have run separate analyses for each Bethesda level to evaluate the proportional of residual heterogeneity.  I was able to get I^2 and the corresponding confidence intervals for these models:
>
> mregB3 <- rma.uni(measure="PAS", data=B3, xi=TP, ni=posTest, method="REML", mods = ~ factor(Blinded) + prev + factor(HRAS12_13) + factor(KRAS61) + factor(NRAS12_13))
> mregB3
> confint(mregB3)
>
> where B3 is a data frame subset containing only Bethesda level 3 nodules.  I did the same for 4 and 5 separately.
>
> We would also like to estimate I^2 with confidence intervals using all classifications combined (3, 4, 5).  Now we have multiple data points per study, so my thinking was I need to account for potential correlation among these within-study effect sizes.  Having thought further about this since my initial post, I wonder if the existing study-level random effect may be sufficient along with the indicator variables for the Bethesda factor to account for the within-study correlation.  But if I run:
>
> mvregAll2 <- rma(measure="PAS",data=PPVdat, xi=TP, ni=posTest, method="REML", mods = ~ factor(Bethesda) + factor(Blinded) + factor(NRAS12_13) + factor(HRAS12_13) + factor(KRAS61))
>
> then the model is treating each effect size independently, as if it were from a separate study.
>
> So back to my original attempt:
>
> PPVdat <- escalc(measure="PAS", data=RAS_PPV, xi=TP, ni=posTest)
> mvregAll <- rma.mv(data=PPVdat, yi, vi, method="REML", random = ~ 1 | Study, mods = ~ factor(Bethesda) + factor(Blinded) + factor(NRAS12_13) + factor(HRAS12_13) + factor(KRAS61))
>
> The results of mvregAll2 and mvregAll are similar, but I think mvregAll does have the clustered relationship I am looking to include, but perhaps the random effect is conflating study level variation with "sub-study" level variation (variation across each row of data).  Perhaps I need to add a variance component at the effect size level (as if they were separate studies) and then also the study level variance component and specify a model in the three-level form.
>
> I will attempt that and report back.  Any other feedback is appreciated in the meantime.  (Also thanks for pointing out that struct="DIAG" was not necessary and not functional in this model;  I am a SAS user with much less R experience so my "translation" of model specification was off).
>
> Thanks,
> Kyle
>
> -----Original Message-----
> From: Viechtbauer Wolfgang (SP) [mailto:wolfgang.viechtbauer at maastrichtuniversity.nl]
> Sent: Wednesday, August 23, 2017 4:28 PM
> To: Porter, Kyle; 'r-sig-meta-analysis at r-project.org'
> Subject: RE: confidence interval for I^2 in multilevel model
>
> Hi Kyle,
>
> With confint(), you can get CIs for variance components. You can plug the CI bounds for a variance component into the equation for I^2 to obtain a CI for I^2. Here is an example:
>
> dat <- get(data(dat.konstantopoulos2011))
>
> res <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat) res
>
> sav <- confint(res)
> sav
>
> W <- diag(1/dat$vi)
> X <- model.matrix(res)
> P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
> 100 * res$sigma2 / (res$sigma2 + (res$k-res$p)/sum(diag(P))) ### district and school level I^2
> 100 * sav[[1]]$random[1,2:3] / (sav[[1]]$random[1,2:3] + (res$k-res$p)/sum(diag(P))) ### CI for district-level I^2
> 100 * sav[[2]]$random[1,2:3] / (sav[[2]]$random[1,2:3] + (res$k-res$p)/sum(diag(P))) ### CI for the school-level I^2
>
> 100 * sum(res$sigma2) / (sum(res$sigma2) + (res$k-res$p)/sum(diag(P))) ### total I^2
>
> But you cannot use this approach to get a CI for the total I^2. For this, you have to use the 'multivariate parameterization' of this model (see: https://urldefense.proofpoint.com/v2/url?u=http-3A__www.metafor-2Dproject.org_doku.php_analyses-3Akonstantopoulos2011&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=hxh2aEZaqVvq1Q-AYEVNruFcTlW7SyqodvlD9DYBAEk&s=pyRjSQtkISt7XKRSNfqaY_z1vgavriBqDXXixE9S5FI&e= ). So:
>
> res <- rma.mv(yi, vi, random = ~ factor(study) | district, data=dat) res
>
> sav <- confint(res)
> sav
>
> W <- diag(1/dat$vi)
> X <- model.matrix(res)
> P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
> 100 * res$tau2 / (res$tau2 + (res$k-res$p)/sum(diag(P))) ### total I^2
> 100 * sav$random[1,2:3] / (sav$random[1,2:3] + (res$k-res$p)/sum(diag(P))) ### CI for the total I^2
>
> But you are using 'random = ~ 1 | ID', which does not give you a multilevel model. Are you maybe making this mistake?
>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.metafor-2Dproject.org_doku.php_analyses-3Akonstantopoulos2011-23a-5Fcommon-5Fmistake-5Fin-5Fthe-5Fthree-2Dlevel-5Fmodel&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=hxh2aEZaqVvq1Q-AYEVNruFcTlW7SyqodvlD9DYBAEk&s=4gv38VBjmsoL43FcL-G8bvOmX2g8qVxzKVXFuTNgR94&e=
>
> Also, 'struct' is only relevant when you have something like a 'random = ~ inner | outer' structure, so struct="DIAG" has no effect.
>
> Best,
> Wolfgang
>
> -----Original Message-----
> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Porter, Kyle
> Sent: Wednesday, August 23, 2017 22:09
> To: 'r-sig-meta-analysis at r-project.org'
> Subject: [R-meta] confidence interval for I^2 in multilevel model
>
> Hello all,
>
> I am using metafor to calculate an I^2 analogue for a multilevel random effects meta-regression as described at https://urldefense.proofpoint.com/v2/url?u=http-3A__www.metafor-2Dproject.org_doku.php_tips-3Ai2-5Fmultilevel-5Fmultivariate&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=hxh2aEZaqVvq1Q-AYEVNruFcTlW7SyqodvlD9DYBAEk&s=fTY0NPh8yHLSU5xGmS7Jw_AwN_BgYfW7LsgNTFTiy6g&e=
>
> I would like help in calculating a 95% confidence interval for this derivation of I^2.
>
> Specifically, I am using the code taken below from the website to generalize the concept of I^2 to a clustered/multilevel model.
>
> (the text below is taken from https://urldefense.proofpoint.com/v2/url?u=http-3A__www.metafor-2Dproject.org_doku.php_tips-3Ai2-5Fmultilevel-5Fmultivariate-29-3A&d=DwIFAw&c=k9MF1d71ITtkuJx-PdWme51dKbmfPEvxwt8SFEkBfs4&r=iXxbs2wBeu4xyserdi9q_GU4EQVOdW3nGznD_IySc58&m=hxh2aEZaqVvq1Q-AYEVNruFcTlW7SyqodvlD9DYBAEk&s=PB2r6RL4qfP4E9ZKl45VK7T46cUlULNdLDv0O0J6nqI&e=
> Based on the discussion above, it is now very easy to generalize the concept of I2 to such a model (see also Nakagawa & Santos, 2012). That is, we can first compute:
>
> W <- diag(1/dat$vi)
> X <- model.matrix(res)
> P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
> 100 * sum(res$sigma2) / (sum(res$sigma2) + (res$k-res$p)/sum(diag(P)))
>
> My specific application is as follows (the outcome is a single proportion):
>
> PPV_prevdat <- escalc(measure="PAS", data=PPV_prev, xi=TP, ni=posTest)
>
> mvregAll <- rma.mv(data=PPV_prevdat, yi, vi, method="REML", random = ~ 1 | ID, struct="DIAG", mods = ~ factor(Bethesda) + factor(Blinded) + prev +factor(NRAS12_13) + factor(HRAS12_13) + factor(KRAS61))
>
> w <- diag(1/PPV_prevdat$vi)
> x <- model.matrix(mvregAll)
> P <- w - w %*% x %*% solve(t(x) %*% w %*% x) %*% t(x) %*% w
> 100 * sum(mvregAll$sigma2) / (sum(mvregAll$sigma2) + (mvregAll$k-mvregAll$p)/sum(diag(P)))
>
> Any help/suggestion is appreciated.
>
> Thank you,
> Kyle Porter
>
> _______________________________________________
> R-sig-meta-analysis mailing list
> R-sig-meta-analysis at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>
> ---
> This email has been checked for viruses by AVG.
> http://www.avg.com
>
>

-- 
Michael
http://www.dewey.myzen.co.uk/home.html



More information about the R-sig-meta-analysis mailing list