[R-meta] Benefits to metafor when missing vi estimates?

Tue Oct 31 11:54:02 CET 2017

Dear Bronwen,

Simply setting vi=1 for studies where the sampling variance is unknown is not appropriate, since it tells the model that an arbitrary value is the known sampling variance.

Instead, you might want to use a model as suggested by James (in the post you linked to). In your case, you would have to assume homoscedasticity of the error/sampling variances (instead of assuming that they are inversely proportional to the sample sizes or number of replicates). This can then be followed up by using cluster-robust inference methods, which should also account (at least asymptotically) for the fact that the sampling variances are actually heteroscedastic.

One could also use a model that sets the sampling variances to the known values for those studies where the information required to compute 'vi' is available and estimates 'vi' (under the homoscedasticity assumption) for the remaining studies. With a bit of trickery, this can actually be done with metafor. Here is an example:

library(metafor)

dat <- get(data(dat.konstantopoulos2011))

### fit multilevel model
res <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)
res

### pretend that 'vi' is only known for a subset of the studies
### and set 'vi' to 0 for studies where 'vi' is unknown
set.seed(1235)
dat$viknown <- 0
dat$viknown[sample(1:nrow(dat), 10)] <- 1
dat$vi[dat$viknown == 0] <- 0

### fit model that estimates the sampling variance for studies where 'vi' is unknown
### (assuming that the sampling variance is homoscedastic for those studies)
res <- rma.mv(yi, vi,
              random = list(~ 1 | district/school, ~ factor(viknown) | study),
              struct="DIAG", tau2=c(NA,0), data=dat)
res
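
To see what the second random-effects term is doing, it may help to look at the variance components directly. The following is only a sketch that repeats the setup above so it runs on its own; it assumes that the fitted object stores the components in 'res$sigma2' and 'res$tau2' and that the 'tau2' values follow the sorted factor levels of 'viknown' ("0", then "1"), which is also what tau2=c(NA,0) relies on.

```r
library(metafor)

### same setup as above: keep 'vi' for 10 randomly chosen rows, set it to 0 elsewhere
dat <- get(data(dat.konstantopoulos2011))
set.seed(1235)
dat$viknown <- 0
dat$viknown[sample(1:nrow(dat), 10)] <- 1
dat$vi[dat$viknown == 0] <- 0

res <- rma.mv(yi, vi,
              random = list(~ 1 | district/school, ~ factor(viknown) | study),
              struct="DIAG", tau2=c(NA,0), data=dat)

### variance components for district and for school within district
res$sigma2

### first value: estimated (homoscedastic) sampling variance for rows with viknown == 0
### second value: fixed at 0, so rows with known 'vi' only use their supplied 'vi'
res$tau2
```

If the assumption about the ordering holds, the second 'tau2' value is exactly 0 and the first one plays the role of the common 'vi' for the studies where it was unknown.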

You would want to follow this up with cluster-robust inference methods again, since we know that 'vi' is not homoscedastic in studies where it was unknown. So:

robust(res, cluster=dat$district)

Or more refined:

library(clubSandwich)
coef_test(res, vcov="CR2")

That seems like quite a bit of work, though, compared with just fitting a standard mixed-effects model and using cluster-robust inference:

library(nlme)
res <- lme(yi ~ 1, random = ~ 1 | district, data=dat)
coef_test(res, vcov="CR2")

Best,
Wolfgang

-- 
Wolfgang Viechtbauer, Ph.D., Statistician | Department of Psychiatry and 
Neuropsychology | Maastricht University | P.O. Box 616 (VIJV1) | 6200 MD 
Maastricht, The Netherlands | +31 (43) 388-4170 | http://www.wvbauer.com 

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Bronwen Stanford
Sent: Monday, 30 October, 2017 19:17
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] Benefits to metafor when missing vi estimates?

I am conducting a meta-analysis on a dataset that contains sample size and
error estimates for only 15% of the data points. I'm constructing a
mixed-effects (multi-level) model using rma.mv, and the model includes one
random effect (representing study) and multiple fixed effects, both
continuous and categorical. I have been advised to use metafor and assign a
constant value to vi (e.g. vi=1) for all data points without error
estimates to improve the model estimates of standard errors.  However,
based on answers such as

https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-October/000252.html

this seems like a potentially inappropriate use of metafor - I'm telling
the model I have information about variance when variance is in fact
unknown (and my dataset does not qualify for a "true" meta-analysis).

My coefficient estimates using metafor (with vi=1) and lmer (or lme) are
also different (in both magnitude and significance), which concerns me. Any
thoughts on the most appropriate way to approach this less-than-ideal
dataset? Does using metafor in this case (with a constant vi value) improve
model accuracy, or is it reasonable to stick with standard mixed-effects
modeling packages?

Thanks!