# [R-meta] Benefits to metafor when missing vi estimates?

Bronwen Stanford bstanfor at ucsc.edu
Fri Nov 10 01:41:56 CET 2017

```
Thank you very much for this.

I'd like to make sure I'm interpreting the code you provided for estimating
variance within metafor correctly. My understanding is that the tau2 term
is setting the variance associated with each individual row ("number")
equal to zero, and applying one level of variance to those studies with
variance known, and a different level to studies with variance unknown (so
viknown=1 and viknown=0 get different estimates). This allows the model to
apply a larger tau2 value to the studies without variance if needed, to
compensate for the fact that their vi values are set at 0. Is that right?

Would I be correct in thinking that for those points without variance this
model behaves very similarly to the nlme model, and the main benefit is
that I can use the provided variance for those 15% of points with variance
included?

Another possibility that has been suggested to me is using the data points
with known variance to estimate one I2 for the entire dataset, and then
using this to calculate vi. This would result in one vi value for the
entire dataset, which seems like it has similar problems as setting all
vi=1. Do you see benefits to an approach like this?

Thank you so much
Bronwen

Bronwen Stanford
Ph.D. Candidate
Environmental Studies Department
University of California, Santa Cruz

On Tue, Oct 31, 2017 at 3:54 AM, Viechtbauer Wolfgang (SP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:

> Dear Bronwen,
>
> Simply setting vi=1 for studies where the sampling variance is unknown is
> not appropriate.
>
> Instead, you might want to use a model as suggested by James (in the post
> you linked to). In your case, you would have to assume homoscedasticity of
> the error/sampling variances (instead of assuming that they are inversely
> proportional to the sample sizes or number of replicates). This can then be
> followed up by using cluster-robust inference methods, which should also
> account (at least asymptotically) for the fact that the sampling variances
> are actually heteroscedastic.
>
> One could also use a model that sets the sampling variances to the known
> values for those studies where the information required to compute 'vi' is
> available and estimates 'vi' (under the homoscedasticity assumption) for
> the remaining studies. With a bit of trickery, this can actually be done
> with metafor. Here is an example:
>
> library(metafor)
>
> dat <- get(data(dat.konstantopoulos2011))
>
> ### fit multilevel model
> res <- rma.mv(yi, vi, random = ~ 1 | district/school, data=dat)
> res
>
> ### pretend that 'vi' is only known for a subset of the studies
> ### and set 'vi' to 0 for studies where 'vi' is unknown
> set.seed(1235)
> dat$viknown <- 0
> dat$viknown[sample(1:nrow(dat), 10)] <- 1
> dat$vi[dat$viknown == 0] <- 0
>
> ### fit model that estimates the sampling variance for studies where
> ### 'vi' is unknown (assuming that the sampling variance is
> ### homoscedastic for those studies)
> res <- rma.mv(yi, vi,
>               random = list(~ 1 | district/school, ~ factor(viknown) | study),
>               struct="DIAG", tau2=c(NA,0), data=dat)
> res
>
> You would want to follow this up with cluster-robust inference methods
> again, since we know that 'vi' is not homoscedastic in studies where it
> was unknown. So:
>
> robust(res, cluster=dat$district)
>
> Or more refined:
>
> library(clubSandwich)
> coef_test(res, vcov="CR2")
>
> That seems like quite a bit of work though instead of just:
>
> library(nlme)
> res <- lme(yi ~ 1, random = ~ 1 | district, data=dat)
> coef_test(res, vcov="CR2")
>
> Best,
> Wolfgang
>
> --
> Wolfgang Viechtbauer, Ph.D., Statistician | Department of Psychiatry and
> Neuropsychology | Maastricht University | P.O. Box 616 (VIJV1) | 6200 MD
> Maastricht, The Netherlands | +31 (43) 388-4170 | http://www.wvbauer.com
>
> -----Original Message-----
> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org]
> On Behalf Of Bronwen Stanford
> Sent: Monday, 30 October, 2017 19:17
> To: r-sig-meta-analysis at r-project.org
> Subject: [R-meta] Benefits to metafor when missing vi estimates?
>
> I am conducting a meta-analysis on a dataset that contains sample size and
> error estimates for only 15% of the data points. I'm constructing a
> mixed-effects (multi-level) model using rma.mv, and the model includes one
> random effect (representing study) and multiple fixed effects, both
> continuous and categorical. I have been advised to use metafor and assign a
> constant value to vi (e.g. vi=1) for all data points without error
> estimates to improve the model estimates of standard errors.  However,
> based on answers such as
>
> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2017-October/000252.html
>
> this seems like potentially an inappropriate use of metafor - I'm telling
> the model I have information about variance when variance is in fact
> unknown (and my dataset does not qualify for a "true" meta-analysis).
>
> My coefficient estimates using metafor (with vi=1) and lmer (or lme) are
> also different (in both magnitude and significance), which concerns me. Any
> thoughts on the most appropriate way to approach this less-than-ideal
> dataset? Does using metafor in this case (with a constant vi value) improve
> model accuracy, or is it reasonable to stick with standard mixed-effects
> modeling packages?
>
> Thanks!
>

```
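
For readers following the `tau2=c(NA,0)` question, here is a sketch of the model that the second `rma.mv()` call in the quoted message appears to specify. It assumes that the levels of `factor(viknown)` are ordered 0 then 1 and that each row of the data has its own `study` value. The symbols are notation chosen for this note, not taken from the thread; $i$ indexes districts and $j$ indexes rows within a district.

$$
\begin{aligned}
y_{ij} &= \mu + u_i + w_{ij} + \delta_{ij} + e_{ij}, \\
u_i &\sim N(0, \sigma^2_1), \qquad w_{ij} \sim N(0, \sigma^2_2), \\
e_{ij} &\sim N(0, v_{ij}), \qquad v_{ij} = 0 \ \text{when } \mathrm{viknown}_{ij} = 0, \\
\delta_{ij} &\sim N(0, \tau^2_k), \qquad k = \mathrm{viknown}_{ij}, \qquad \tau^2_1 = 0, \quad \tau^2_0 \ \text{estimated}.
\end{aligned}
$$

Under these assumptions the marginal variance of a row is $\sigma^2_1 + \sigma^2_2 + v_{ij}$ when `vi` is known and $\sigma^2_1 + \sigma^2_2 + \tau^2_0$ when it is not. On this reading, $\tau^2_0$ stands in for a single common sampling variance in the rows where `vi` was set to 0, while the district and school variance components are shared by both groups.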