[R-meta] Differences between metafor / robumeta / lme4

Sun Jul 12 21:09:21 CEST 2020

Bernard,

Wolfgang read my mind, as usual. Below are some functions and an example of
how to do the first, bootstrap-based approach. The result of boot_cor() is
essentially the same as Wolfgang's rmat function, but it uses a basic
bootstrap to get the variance covariance matrix of the correlation
estimates. My understanding is that the equation-based approach involves
approximations that assume multivariate normality and a sufficiently large
sample size within a given study. The basic bootstrap approach does not
rely on multivariate normality (although it does require a sufficiently
large sample size), and so it might be preferred due to this robustness
property.

As Wolfgang explained, whichever function you use, you'll need to apply it
to each sample within your meta-analysis, then stack up the datasets and
variance-covariance matrices for use in rma.mv().

James

# calculate (z-transformed) correlations, as a vector

cor_vec <- function(data, Ztrans = TRUE) {
  r_mat <- cor(data)
  r_vec <- r_mat[lower.tri(r_mat)]

  if (Ztrans) return(atanh(r_vec)) else return(r_vec)
}

cor_vec(X)

# bootstrap vcov matrix

boot_cor <- function(data, boots = 1999, Ztrans = TRUE) {
  est <- cor_vec(data, Ztrans = Ztrans)
  boot_ests <- replicate(n = boots, {
    boot_index <- sample.int(nrow(data), replace = TRUE)
    cor_vec(data[boot_index,], Ztrans = Ztrans)
  }, simplify = FALSE)
  vcov_mat <- cov(do.call(rbind, boot_ests))

  rcols <- combn(names(X),2)
  r_names <- apply(rcols, 2, paste, collapse = ":")
  est <- data.frame(est = est, var1 = rcols[1,], var2 = rcols[2,], var1var2
= r_names, stringsAsFactors = FALSE)
  colnames(vcov_mat) <- rownames(vcov_mat) <- r_names
  list(est = est, vcov = vcov_mat)
}

# make some data
library(mvtnorm)
set.seed(1234)
r <- 0.4
N <- 100
X <- as.data.frame(rmvnorm(N, sigma = r + diag(1 - r, 4)))
names(X) <- c("BMI","Dep-Q","Dep-I","Dep-F")

# bootstrapped vcov matrix
boot_cor(X)

# compare to Equation-based approach
source("
https://gist.githubusercontent.com/wviechtb/700983ab0bde94bed7c645fce770f8e9/raw/9ac23945a42f3549f616cc99f3e77de3e8360b10/rmat.r
")
rmat(cor(X), n = nrow(X), rtoz = TRUE)

James

On Sun, Jul 12, 2020 at 12:21 PM Viechtbauer, Wolfgang (SP) <
wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:

> Dear Bernard,
>
> What James was suggesting is to compute the correlations that you want to
> meta-analyze and their covariances and then meta-analyze the correlations
> using a multivariate model. To compute the covariances, you can either use
> equations provided by Steiger (1980) (and others) or use bootstrapping. For
> the latter, you would use the raw data, so your attempts to use
> bootstrapping on the correlations is headed in the wrong direction.
>
> Let me focus on the first approach. In particular, see this:
>
> https://gist.github.com/wviechtb/700983ab0bde94bed7c645fce770f8e9
>
> This function will compute the covariances using the 'equation approach'.
> So, for each study, first construct the full correlation matrix. Let's say
> a study used 3 different depression measures plus one way of measuring BMI.
> Then there is a 4x4 matrix of correlations, which includes 3 correlations
> between depression and BMI (which is what you are interested in) but also 3
> pairwise correlations between the depression measures. Give proper row and
> column names to this matrix. For example:
>
> rownames(Rmat) <- colnames(Rmat) <- c("BMI", "SR", "INT", "MF")
>
> (SR = self-report, INT = interviews, MD = medical file)
>
> Pass this correlation matrix to the rmat() function, plus the sample size.
> I would also recommend rtoz=TRUE. Then you will get back the 6 r-to-z
> transformed correlations ($dat) and the 6x6 var-cov matrix of those r-to-z
> transformed correlations ($V).
>
> Do this for all 7 studies. Make sure you use the same row and column
> naming scheme across studies. If a study only included, for example, SR and
> INT, then you have a 3x3 matrix.
>
> Combine $dat from all 7 studies into one big data frame ('dat'). Combine
> $V into one big block-diagonal matrix ('V') (bldiag() from metafor can be
> used for this).
>
> Then fit a multivariate model with:
>
> res <- rma.mv(yi, V, mods = ~ var1var2 - 1, random = ~ var1var2 | study,
> struct="UN", data=dat)
> res
>
> You will get pooled estimates of the r-to-z transformed correlations for
> all possible pairs. This will also include pairs that you are not
> interested in (e.g., for "SR" and "INT"). You are interested in the BMI.SR,
> BMI.INT, and BMI.MF correlations.
>
> You can now back-transform the pooled estimates to correlations using:
>
> predict(res, diag(6), transf=transf.ztor)
>
> You can also test if there are differences between the BMI.SR, BMI.INT,
> and BMI.MF correlations using anova(). For example:
>
> anova(res, L = rbind(c(1,-1,0,0,0,0), c(1,0,-1,0,0,0), c(0,1,-1,0,0,0))
>
> but you need to make sure that the position of the 1 and -1's actually
> correspond to the position of the 3 coefficients you are interested in. A
> global (2 df) test of those differences will be part of this output:
>
> anova(res, L = rbind(c(1,-1,0,0,0,0), c(1,0,-1,0,0,0))
>
> If there is no evidence for differences, you could consider collapsing the
> three depression measure levels ("SR", "INT", "MF") into one. For example:
>
> predict(res, c(1/3, 1/3, 1/3, 0, 0, 0), transf=transf.ztor)
>
> (again, I am assuming that the first three coefficients are the BMI.<dep
> measure> correlations, but check that first in your output). Or you could
> collapse the codings in 'dat' and refit the model.
>
> Best,
> Wolfgang
>
> >-----Original Message-----
> >From: R-sig-meta-analysis [mailto:
> r-sig-meta-analysis-bounces using r-project.org]
> >On Behalf Of Bernard Fernou
> >Sent: Sunday, 12 July, 2020 13:05
> >To: James Pustejovsky
> >Cc: r-sig-meta-analysis using r-project.org
> >Subject: Re: [R-meta] Differences between metafor / robumeta / lme4
> >
> >Dear James,
> >
> >Thank you very much for your prompt reply! We would like to determine
> >whether there is a crude association between BMI and depression symptoms
> in
> >patients with a neurodevelopmental disorder. We would like to draw a
> random
> >effects inference.
> >
> >We understand the interest in trying to capture the correlation between
> the
> >effect size estimates. I have read the chapter of Gleser & Olkin (2009)
> >but, since it did not describe formulas when correlations are used as ES,
> >we did not know how to move forward with our situation. This is why we
> have
> >adopted a RVE approach.
> >
> >I am very interested in trying to apply the two approaches you suggest. I
> >have tried to read a bit on bootstrapping in the context of meta-analysis
> >but that exceeded my (very basic) level of statistical knowledge.
> >
> >I found some resources on metafor's website on bootstrapping (
> >http://www.metafor-project.org/doku.php/analyses:viechtbauer2007a).
> >Although I managed to implement it on our data, I was not sure that this
> >approach is the one you recommended to us (essentially, I did not
> >understand how it provides information on the correlation between our ES).
> >Should the data.gen function be modified to integrate the correlation
> >between the effect sizes?
> >  data.gen <- function(dat, mle) {
> >    data.frame(yi=rnorm(nrow(dat), mle$mu,
> >                        sqrt(mle$tau2 + dat$vi)), vi=dat$vi)
> >  }
> >
> >If you have some resources that could guide us, it would be very
> >appreciated.
> >
> >Thank you so much for your invaluable help!
> >
> >Bernard
> >
> >Le sam. 11 juil. 2020 à 15:23, James Pustejovsky <jepusto using gmail.com> a
> >écrit :
> >
> >> Bernard,
> >> Could you clarify what is the goal of your analysis? What question(s)
> are
> >> you trying to answer? And in particular, are you aiming to draw a random
> >> effects inference (to a broader population of studies) or a fixed
> effects
> >> inference (limited to the set of studies in the analysis)?
> >>
> >> Generally, I would recommend using an approach that captures the
> >> correlations between the effect size estimates (i.e., the correlation
> >> between the estimated correlation coefficients). As far as I can tell,
> >none
> >> of the approaches you listed really does that. The robumeta and metafor
> >> models make arbitrary assumptions about the correlation, then use RVE to
> >> protect against this. But given that you have the raw data, you should
> be
> >> able to obtain all the information you would need to understand the
> >> correlations—either through the old Steiger 1980 formulas or (probably
> >> better?) by bootstrapping the set of ES estimates from each study.
> >>
> >> James
> >>
> >> > On Jul 11, 2020, at 6:15 AM, Bernard Fernou <bernard.fernou using gmail.com
> >
> >> wrote:
> >> >
> >> > Hi everyone,
> >> > I hope this email finds you well.
> >> >
> >> > I am trying to model the relationship between depression and BMI over
> >> > several studies conducted in our research lab. We have access to all
> >> > individual patient data.
> >> >
> >> > We have 7 seven studies in which BMI is always assessed using a
> >> > digital scale and a meter but in which we used various measures to
> >assess
> >> > depression (self reported questionnaire / interviews / medical file).
> >> Each
> >> > study included either 1, 2 or the 3 measures of depression.  The
> effect
> >> > size used is correlation.
> >> >
> >> > We have analyzed our data using 3 methods (2 two-stage approaches:
> robu
> >> > from robumeta / 3 level meta analysis with robust SE using metafor;
> and
> >1
> >> > one-stage approach: a mixed model using lme4).
> >> >
> >> > A complete reproducible example is provided below.
> >> >
> >> > Results from metafor and lme4 agreed (p values were not significant
> and
> >> > very similar: p = .13, .14). However, results from robumeta are quite
> >> > different (the p value is barely significant p = .03). I tried to use
> >> > several rho values in robumeta but it did not change anything. Note
> that
> >> > heterogeneity/inconsistency (assessed using tau² / I²) are very low.
> >> >
> >> > Unfortunately, as this meta analysis did not followed any systematic
> >> > review, we made the mistake to not pre register it and to directly
> >> analyze
> >> > the data. Therefore, we have not any proof of what analysis what
> thought
> >> as
> >> > primary. Even if the effet sizes are similar across 3 methods, it is
> >> > probable that readers will be sensitive to p-values. The choice of the
> >> > statistical analysis is thus critical.
> >> >
> >> > I know that one-stage and two stages approaches could differ because
> of
> >> the
> >> > model specification (Burke et al., 2017). Thus; I suspect that we
> missed
> >> a
> >> > specification differing between robumeta on the one side and both
> >> > metafor+lme4 on the other.
> >> >
> >> > I do not know if this mailing list is the right place to ask for help
> >but
> >> > if someone could give me a feedback on the code used and on the best
> >> > approach to analyze our data, it would be greatly appreciated!
> >> >
> >> > Best,
> >> >
> >> > Bernard
> >> >
> >> >
> >> > set.seed(1234)
> >> > BMI<-rnorm(1000, 23, 4)
> >> > Depression<-rnorm(1000, 10, 2)+0.1*BMI
> >> > cor.test(~BMI+Depression)
> >> > Study<-rep(1:6, times=c(100,100,100,300,200,200))
> >> > Outcome<-c(rep(c("Questionnaire", "Interview"),
> times=c(100,200)),#first
> >> 3
> >> > studies
> >> >           rep(c("Questionnaire", "Interview", "MedicalFile"),
> >> > each=100),#4th study
> >> >           rep(c("Questionnaire", "Interview"), each=100,
> times=2))#5&6th
> >> > studies
> >> >
> >> > df<-data.frame(cbind(BMI,Depression,Study,Outcome))
> >> >
> >> > #see the structure of the data
> >> > df %>%
> >> >  group_by(Study ,Outcome) %>%
> >> >  summarise(N=n())
> >> >
> >> > corFUNC=function(x){
> >> >  list(
> >> >    cor.test(~as.numeric(x$BMI)+as.numeric(x$Depression))$estimate,
> >> >
> (cor.test(~as.numeric(x$BMI)+as.numeric(x$Depression))$parameter+2))
> >> > }
> >> >
> >> > df.cor<-split(df, list(df$Study, df$Outcome), drop=TRUE)
> >> > lapply(df.cor,corFUNC)
> >> >
> >> > df.meta.1<-as.data.frame(do.call(rbind, lapply(df.cor,corFUNC)))
> >> >
> >> > df.meta<-df.meta.1 %>%
> >> >  dplyr::mutate(Study.Outcome=row.names(df.meta.1)) %>%
> >> >  tidyr::separate(Study.Outcome, into=c("Study", "Outcome")) %>%
> >> >  dplyr::rename("Cor"=V1,"N"=V2) %>%
> >> >  dplyr::arrange(Study) %>%
> >> >  dplyr::mutate(
> >> >    yi=yi<-escalc(measure="ZCOR", ri=as.numeric(Cor), ni=as.numeric(N),
> >> > data=df.meta)$yi,
> >> >    vi=yi<-escalc(measure="ZCOR", ri=as.numeric(Cor), ni=as.numeric(N),
> >> > data=df.meta)$vi)
> >> >
> >> > Approach1=robumeta::robu(
> >> >    formula = yi ~ 1,
> >> >    v=vi,
> >> >    studynum = Study,
> >> >    small = TRUE,
> >> >    rho=0.3,
> >> >    data = df.meta)
> >> >
> >> >  df$BMI<-as.numeric(df$BMI); df$Depression<-as.numeric(df$Depression)
> >> >
> >> >
> >>
>
> >Approach2=lme4::lmer(BMI~Depression+(1+Depression|Study)+(1+Depression|Outco
> >me:Study),
> >> > data=df)
> >> >
> >> >  Approach3=rma.mv(yi, vi, random = ~ 1 |Study/Outcome, data=df.meta)
> >> >  robust(Approach3, cluster = df.meta$Study)
>

	[[alternative HTML version deleted]]