[R-meta] I-square formula

Sun Mar 10 11:34:38 CET 2019

Dear Jan,

Using the data from help(dat.ishak200) as a fully reproducible example:

### load data
dat <- get(data(dat.ishak2007))

### create long format dataset
dat.long <- reshape(dat, direction="long", idvar="study", v.names=c("yi","vi"),
                    varying=list(c(2,4,6,8), c(3,5,7,9)))
dat.long <- dat.long[order(dat.long$study, dat.long$time),]
rownames(dat.long) <- 1:nrow(dat.long)

### remove missing measurement occasions from dat.long
is.miss  <- is.na(dat.long$yi)
dat.long <- dat.long[!is.miss,]

### construct the full (block diagonal) V matrix with an AR(1) structure
rho.within <- .97 ### value as estimated by Ishak et al. (2007)
V <- lapply(split(with(dat, cbind(v1i, v2i, v3i, v4i)), dat$study), diag)
V <- lapply(V, function(v) sqrt(v) %*% toeplitz(ARMAacf(ar=rho.within, lag.max=3)) %*% sqrt(v))
V <- bldiag(V)
V <- V[!is.miss,!is.miss] ### remove missing measurement occasions from V

### multivariate model with heteroscedastic AR(1) structure for the true effects
res <- rma.mv(yi, V, mods = ~ factor(time) - 1, random = ~ time | study,
              struct = "HAR", data = dat.long)
print(res, digits=2)

### compute I^2 (per timepoint)
W <- solve(V)
X <- model.matrix(res)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
100 * res$tau2 / (res$tau2 + (res$k-res$p)/sum(diag(P)))

### compute 'typical' sampling variance per timepoint
sapply(1:4, function(i) 100 * res$tau2[i] / (res$tau2[i] + (sum(dat.long$time == i)-1)/sum(diag(P)[dat.long$time == i])))

### same idea conceptually as Jackson et al. (2012) approach
res.R <- res
res.F <- rma.mv(yi, V, mods = ~ factor(time) - 1, data = dat.long)
sapply(1:4, function(i) 100 * (vcov(res.R)[i,i] - vcov(res.F)[i,i]) / vcov(res.R)[i,i])

Note that I am computing I^2-type statistics for each timepoint, which I would recommend (instead of a single value).

The first two approaches give essentially the same results (I^2 values of around 94% at each timepoint). The approach based on Jackson et al. (2012) gives slightly lower values (80-86%, depending on the timepoint), although the conclusion is the same, namely that the amount of residual heterogeneity is relatively large.

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Jan Stochl
Sent: Friday, 01 March, 2019 15:36
To: r-sig-meta-analysis using r-project.org
Subject: [R-meta] I-square formula

Dear meta-analysers,
I am writing with a kind request for your help. I am struggling to come up with a correct solution for the computation of I-squared. I would be very grateful if anyone could confirm the formulas I have used (at the end of the email).

In essence, we did meta-analyses in the metafor package where the studies are nested within timepoints. The first few rows of the data (object temp) looks like this:

temp <- structure(list(Study = c("Addington et al. (2011)", "Addington et al. (2011)", 
"Addington et al. (2011)", "Ising et al. (2016)", "McGorry et al. (2013)", 
"Morrison et al. (2012)", "Morrison et al. (2012)", "Morrison et al. (2012)", 
"Stain et al. (2016)", "Stain et al. (2016)"), Comparison = c("CBT vs. ST", 
"CBT vs. ST", "CBT vs. ST", "CBT vs. TAU", "CBT vs. ST", "CBT vs. TAU", 
"CBT vs. TAU", "CBT vs. TAU", "CBT vs. ST", "CBT vs. ST"), Time = c(1, 
2, 3, 3, 1, 1, 2, 3, 1, 2), Time_2 = c("post-intervention", "6 months post-intervention", 
"12 months post-intervention", "42 months post-intervention", 
"post-intervention", "post-intervention", "6 months post-intervention", 
"18 months post-intervention", "post-intervention", "6 months post-intervention"
), yi = structure(c(0.211356631477208, 0.185954935595866, 0.177021020023009, 
-0.164257284037062, 0.437686730662269, -0.00209427564031285, 
-0.0614642122294613, -0.245724945196732, -0.572957258413666, 
-0.341534117584298), measure = "SMD", ni = c(35, 31, 28, 88, 
48, 185, 183, 61, 31, 22), slab = c("Addington et al. (2011).1", 
"Addington et al. (2011).6", "Addington et al. (2011).11", "Ising et al. (2016).1", 
"McGorry et al. (2013).1", "Morrison et al. (2012).1", "Morrison et al. (2012).7", 
"Morrison et al. (2012).13", "Stain et al. (2016).1", "Stain et al. (2016).7"
)), vi = c(0.11576974502836, 0.129724396312997, 0.144149322902779, 
0.0458201376889337, 0.0891098550076188, 0.0216222652434626, 0.0218741212760093, 
0.0665122401869221, 0.134461505698423, 0.185984368639563)), yi.names = "yi", vi.names = "vi", digits = 4, row.names = c(1L, 
36L, 62L, 79L, 11L, 18L, 44L, 73L, 24L, 50L), class = c("escalc", 
"data.frame"))

temp

Study Outcome Comparison Time Time_2 yi vi
Addington et al. (2011) Depression CBT vs. ST 1 post-intervention 0.211356631 0.115769745
Addington et al. (2011) Depression CBT vs. ST 2 6 months post-intervention 0.185954936 0.129724396
Addington et al. (2011) Depression CBT vs. ST 3 12 months post-intervention 0.17702102 0.144149323
Ising et al. (2016) Depression CBT vs. TAU 3 42 months post-intervention -0.164257284 0.045820138
McGorry et al. (2013) Depression CBT vs. ST 1 post-intervention 0.437686731 0.089109855
Morrison et al. (2012) Depression CBT vs. TAU 1 post-intervention -0.002094276 0.021622265
Morrison et al. (2012) Depression CBT vs. TAU 2 6 months post-intervention -0.061464212 0.021874121
Morrison et al. (2012) Depression CBT vs. TAU 3 18 months post-intervention -0.245724945 0.06651224
Stain et al. (2016) Depression CBT vs. ST 1 post-intervention -0.572957258 0.134461506
Stain et al. (2016) Depression CBT vs. ST 2 6 months post-intervention -0.341534118 0.185984369

# Computation of variance-covariance matrix
rho.within <- .97
V <- lapply(split(with(temp_wide, cbind(vi.1, vi.2, vi.3)), temp_wide$Study), diag)
V <- lapply(V, function(v) sqrt(v) %*%  toeplitz(ARMAacf(ar=rho.within, lag.max=2)) %*% sqrt(v))
V <- bldiag(V)
keep <- c(1:3,6,7,10:14)
V <- V[keep,keep]

 After a lot of consideration I decided to use the following models, also because we wanted to estimate the overall effect as well as specific effects for the levels in column "Comparison":

res <- rma.mv(yi=yi, V, data = temp, random = ~ Time + 1 | Study, struct = "HAR", rho=c(NA), intercept=TRUE)
res_st <- rma.mv(yi=yi, V, data = temp, random = ~ Time | Study, struct = "HAR", subset=temp$Comparison=="CBT vs. ST") 
res_tau <- rma.mv(yi=yi, V, data = temp, random = ~ Time | Study, struct = "HAR", subset=temp$Comparison=="CBT vs. TAU")

As an example, and for easier understanding of variance components of the model I am sending the example output:

summary(res)

Multivariate Meta-Analysis Model (k = 10; method: REML)

  logLik  Deviance       AIC       BIC      AICc  
  3.8815   -7.7630    2.2370    3.2231   22.2370  

Variance Components: 

outer factor: Study (nlvls = 5)
inner factor: Time  (nlvls = 3)

            estim    sqrt  k.lvl  fixed  level
tau^2.1    0.0989  0.3145      4     no      1
tau^2.2    0.0427  0.2067      3     no      2
tau^2.3    0.0047  0.0685      3     no      3
rho        1.0000                    no       

Test for Heterogeneity: 
Q(df = 9) = 20.8144, p-val = 0.0135

Model Results:

estimate      se     zval    pval    ci.lb   ci.ub   
 -0.0587  0.1058  -0.5544  0.5793  -0.2661  0.1487   

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Now I am struggling a bit with the correct formula and computation for I-squared for these models. I found Wolfgang's page (http://www.metafor-project.org/doku.php/tips:i2_multilevel_multivariate [http://www.metafor-project.org/doku.php/tips:i2_multilevel_multivariate]) which was extremely helpful, however, I am not entirely sure I am using the correct one. So for these specific models, I am using the following code to compute I-squared and it would be extremely helpful if you can confirm these are correct. 

# for res model
W <- diag(1/temp$vi)
X <- model.matrix(res)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
i2_res <- 100 * sum(res$tau2) / (sum(res$tau2) + (res$k-res$p)/sum(diag(P)))

# for res_st model
W <- diag(1/temp[temp$Comparison=="CBT vs. ST",]$vi)
X <- model.matrix(res_st)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
i2_res_st <- 100 * sum(res_st$tau2) / (sum(res_st$tau2) + (res_st$k-res_st$p)/sum(diag(P)))

# for res_tau model
W <- diag(1/temp[temp$Comparison=="CBT vs. TAU",]$vi)
X <- model.matrix(res_tau)
P <- W - W %*% X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W
i2_res_tau <- 100 * sum(res_tau$tau2) / (sum(res_tau$tau2) + (res_tau$k-res_tau$p)/sum(diag(P)))

Thank you very much for your kind consideration and help. I know it is a big favour from you. I am sure we can acknowledge your kind help in the publication which is nearing completion...

Kind regards,
Jan
__________________________
Jan Stochl, Ph.D.
Senior Research Associate
Department of Psychiatry
University of Cambridge
Cambridge Biomedical Campus, Box 189
Cambridge, CB2 0QQ, UK

Tel: 44 (0)7587146299
Email: js883 using cam.ac.uk [mailto:js883 using cam.ac.uk]