[R-meta] Differences between metafor / robumeta / lme4

Mon Jul 13 13:25:00 CEST 2020

Dear Wolfgang, James,

First of all, thank you so much for your help. I guess I'm neither the
first nor the last to say this, but receiving such quality support is
amazing.

I strictly followed your recommendations. The use of the equation-based
approach vs. bootstrapping did not change the results much but the
bootstrapping approach was slightly more conservative. With your advice, I
can analyze the data for this study. However, I would like to ask you one
last question to make sure that I understand everything I am doing so that
I can reproduce this type of analysis on my own in the future.

I am not sure to understand the interest of including the correlation with
the BMI in the variance-covariance matrix. Why not simply include the
correlations between the 3 measures of depression (and having thus the 10
effect sizes of interest and a 10x10 variance-covariance matrix instead of
having 15 effect sizes (5 are not of interest) and a 15x15 matrix).

I compared the differences in results when using 10 effect sizes and a
10x10 matrix vs. 15 effect sizes and a 15x15 matrix. The p-values changed
greatly depending on the approach used when trying to obtain the overall
association between BMI and depression (but this may be due to an error).
However, the effect sizes for each outcome were similar.

Let's say that V is the 15x15 variance-covariance matrix, V2 is the 10x10
variance-covariance matrix. dat is a dataset with 15 ES and dat2 a dataset
with the 10 ES of interest.

A) I think that when using 15 ES and a 15x15 variance covariance matrix,
the correct approach to obtain the global test of the relationship
BMI-depression is the joint test of our associations of interest using the
anova command.

res <- rma.mv(yi, V, mods = ~ var1var2 - 1, random = ~ var1var2 | study,
struct="UN", data=dat)

anova(res, L=rbind(c(0,0,1,0,0,0),c(0,0,0,0,1,0),c(0,0,0,0,0,1))) #the 1
are placed at each outcome-BMI position. The omnibus p-value was .30

However, I do not understand how to obtain the global effect size and its
95% CI using this method (I guess that global tau² can be computed by
combining (somehow) tau² values from the res output; I found the way to do
so for 2 outcomes in the Berkey1998 example on the metafor’s website).

B) In a second step, I have tried to obtain this global effect size by
constructing a dataset with the 10 ES of interest + a 10x10 covariance
matrix and by running a model without specifying any moderator.

res2 <- rma.mv(yi, V2, random = ~ var1var2 | study, struct="UN", data=dat2)
# the overall p-value was .06

I have then tried to include moderators to this model to see whether effect
sizes for each outcome were similar from those estimated previously using
the 15x15 matrix. Indeed, effect sizes were very similar.

res3 <- rma.mv(yi, V2, mods = ~ var1var2 - 1,  random = ~ var1var2 | study,
struct="UN", data=dat2)

These results are exactly identical as running the first approach and then
subsetting the dataset to the associations of interest

res4 <- rma.mv(yi, V, mods = ~ var1var2 - 1, random = ~ var1var2 | study,
struct="UN", data=dat, subset=dat$associations.of.interest)

I guess that the first approach is far better since - if I understand it
well - it allows to have a general model that could be then decomposed
(which may be particularly useful if one has a set of studies with multiple
treatments and multiple outcomes for example?). However, from this general
model, I am not sure how to decompose it in order to obtain the global
effect of interest in my case.

Thank you again for your help

Best

Bernard

Le dim. 12 juil. 2020 à 21:09, James Pustejovsky <jepusto using gmail.com> a
écrit :

> Bernard,
>
> Wolfgang read my mind, as usual. Below are some functions and an example
> of how to do the first, bootstrap-based approach. The result of boot_cor()
> is essentially the same as Wolfgang's rmat function, but it uses a basic
> bootstrap to get the variance covariance matrix of the correlation
> estimates. My understanding is that the equation-based approach involves
> approximations that assume multivariate normality and a sufficiently large
> sample size within a given study. The basic bootstrap approach does not
> rely on multivariate normality (although it does require a sufficiently
> large sample size), and so it might be preferred due to this robustness
> property.
>
> As Wolfgang explained, whichever function you use, you'll need to apply it
> to each sample within your meta-analysis, then stack up the datasets and
> variance-covariance matrices for use in rma.mv().
>
> James
>
> # calculate (z-transformed) correlations, as a vector
>
> cor_vec <- function(data, Ztrans = TRUE) {
>   r_mat <- cor(data)
>   r_vec <- r_mat[lower.tri(r_mat)]
>
>   if (Ztrans) return(atanh(r_vec)) else return(r_vec)
> }
>
> cor_vec(X)
>
>
> # bootstrap vcov matrix
>
> boot_cor <- function(data, boots = 1999, Ztrans = TRUE) {
>   est <- cor_vec(data, Ztrans = Ztrans)
>   boot_ests <- replicate(n = boots, {
>     boot_index <- sample.int(nrow(data), replace = TRUE)
>     cor_vec(data[boot_index,], Ztrans = Ztrans)
>   }, simplify = FALSE)
>   vcov_mat <- cov(do.call(rbind, boot_ests))
>
>   rcols <- combn(names(X),2)
>   r_names <- apply(rcols, 2, paste, collapse = ":")
>   est <- data.frame(est = est, var1 = rcols[1,], var2 = rcols[2,],
> var1var2 = r_names, stringsAsFactors = FALSE)
>   colnames(vcov_mat) <- rownames(vcov_mat) <- r_names
>   list(est = est, vcov = vcov_mat)
> }
>
>
> # make some data
> library(mvtnorm)
> set.seed(1234)
> r <- 0.4
> N <- 100
> X <- as.data.frame(rmvnorm(N, sigma = r + diag(1 - r, 4)))
> names(X) <- c("BMI","Dep-Q","Dep-I","Dep-F")
>
> # bootstrapped vcov matrix
> boot_cor(X)
>
> # compare to Equation-based approach
> source("
> https://gist.githubusercontent.com/wviechtb/700983ab0bde94bed7c645fce770f8e9/raw/9ac23945a42f3549f616cc99f3e77de3e8360b10/rmat.r
> ")
> rmat(cor(X), n = nrow(X), rtoz = TRUE)
>
> James
>
> On Sun, Jul 12, 2020 at 12:21 PM Viechtbauer, Wolfgang (SP) <
> wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
>> Dear Bernard,
>>
>> What James was suggesting is to compute the correlations that you want to
>> meta-analyze and their covariances and then meta-analyze the correlations
>> using a multivariate model. To compute the covariances, you can either use
>> equations provided by Steiger (1980) (and others) or use bootstrapping. For
>> the latter, you would use the raw data, so your attempts to use
>> bootstrapping on the correlations is headed in the wrong direction.
>>
>> Let me focus on the first approach. In particular, see this:
>>
>> https://gist.github.com/wviechtb/700983ab0bde94bed7c645fce770f8e9
>>
>> This function will compute the covariances using the 'equation approach'.
>> So, for each study, first construct the full correlation matrix. Let's say
>> a study used 3 different depression measures plus one way of measuring BMI.
>> Then there is a 4x4 matrix of correlations, which includes 3 correlations
>> between depression and BMI (which is what you are interested in) but also 3
>> pairwise correlations between the depression measures. Give proper row and
>> column names to this matrix. For example:
>>
>> rownames(Rmat) <- colnames(Rmat) <- c("BMI", "SR", "INT", "MF")
>>
>> (SR = self-report, INT = interviews, MD = medical file)
>>
>> Pass this correlation matrix to the rmat() function, plus the sample
>> size. I would also recommend rtoz=TRUE. Then you will get back the 6 r-to-z
>> transformed correlations ($dat) and the 6x6 var-cov matrix of those r-to-z
>> transformed correlations ($V).
>>
>> Do this for all 7 studies. Make sure you use the same row and column
>> naming scheme across studies. If a study only included, for example, SR and
>> INT, then you have a 3x3 matrix.
>>
>> Combine $dat from all 7 studies into one big data frame ('dat'). Combine
>> $V into one big block-diagonal matrix ('V') (bldiag() from metafor can be
>> used for this).
>>
>> Then fit a multivariate model with:
>>
>> res <- rma.mv(yi, V, mods = ~ var1var2 - 1, random = ~ var1var2 | study,
>> struct="UN", data=dat)
>> res
>>
>> You will get pooled estimates of the r-to-z transformed correlations for
>> all possible pairs. This will also include pairs that you are not
>> interested in (e.g., for "SR" and "INT"). You are interested in the
>> BMI.SR, BMI.INT, and BMI.MF correlations.
>>
>> You can now back-transform the pooled estimates to correlations using:
>>
>> predict(res, diag(6), transf=transf.ztor)
>>
>> You can also test if there are differences between the BMI.SR, BMI.INT,
>> and BMI.MF correlations using anova(). For example:
>>
>> anova(res, L = rbind(c(1,-1,0,0,0,0), c(1,0,-1,0,0,0), c(0,1,-1,0,0,0))
>>
>> but you need to make sure that the position of the 1 and -1's actually
>> correspond to the position of the 3 coefficients you are interested in. A
>> global (2 df) test of those differences will be part of this output:
>>
>> anova(res, L = rbind(c(1,-1,0,0,0,0), c(1,0,-1,0,0,0))
>>
>> If there is no evidence for differences, you could consider collapsing
>> the three depression measure levels ("SR", "INT", "MF") into one. For
>> example:
>>
>> predict(res, c(1/3, 1/3, 1/3, 0, 0, 0), transf=transf.ztor)
>>
>> (again, I am assuming that the first three coefficients are the BMI.<dep
>> measure> correlations, but check that first in your output). Or you could
>> collapse the codings in 'dat' and refit the model.
>>
>> Best,
>> Wolfgang
>>
>> >-----Original Message-----
>> >From: R-sig-meta-analysis [mailto:
>> r-sig-meta-analysis-bounces using r-project.org]
>> >On Behalf Of Bernard Fernou
>> >Sent: Sunday, 12 July, 2020 13:05
>> >To: James Pustejovsky
>> >Cc: r-sig-meta-analysis using r-project.org
>> >Subject: Re: [R-meta] Differences between metafor / robumeta / lme4
>> >
>> >Dear James,
>> >
>> >Thank you very much for your prompt reply! We would like to determine
>> >whether there is a crude association between BMI and depression symptoms
>> in
>> >patients with a neurodevelopmental disorder. We would like to draw a
>> random
>> >effects inference.
>> >
>> >We understand the interest in trying to capture the correlation between
>> the
>> >effect size estimates. I have read the chapter of Gleser & Olkin (2009)
>> >but, since it did not describe formulas when correlations are used as ES,
>> >we did not know how to move forward with our situation. This is why we
>> have
>> >adopted a RVE approach.
>> >
>> >I am very interested in trying to apply the two approaches you suggest. I
>> >have tried to read a bit on bootstrapping in the context of meta-analysis
>> >but that exceeded my (very basic) level of statistical knowledge.
>> >
>> >I found some resources on metafor's website on bootstrapping (
>> >http://www.metafor-project.org/doku.php/analyses:viechtbauer2007a).
>> >Although I managed to implement it on our data, I was not sure that this
>> >approach is the one you recommended to us (essentially, I did not
>> >understand how it provides information on the correlation between our
>> ES).
>> >Should the data.gen function be modified to integrate the correlation
>> >between the effect sizes?
>> >  data.gen <- function(dat, mle) {
>> >    data.frame(yi=rnorm(nrow(dat), mle$mu,
>> >                        sqrt(mle$tau2 + dat$vi)), vi=dat$vi)
>> >  }
>> >
>> >If you have some resources that could guide us, it would be very
>> >appreciated.
>> >
>> >Thank you so much for your invaluable help!
>> >
>> >Bernard
>> >
>> >Le sam. 11 juil. 2020 à 15:23, James Pustejovsky <jepusto using gmail.com> a
>> >écrit :
>> >
>> >> Bernard,
>> >> Could you clarify what is the goal of your analysis? What question(s)
>> are
>> >> you trying to answer? And in particular, are you aiming to draw a
>> random
>> >> effects inference (to a broader population of studies) or a fixed
>> effects
>> >> inference (limited to the set of studies in the analysis)?
>> >>
>> >> Generally, I would recommend using an approach that captures the
>> >> correlations between the effect size estimates (i.e., the correlation
>> >> between the estimated correlation coefficients). As far as I can tell,
>> >none
>> >> of the approaches you listed really does that. The robumeta and metafor
>> >> models make arbitrary assumptions about the correlation, then use RVE
>> to
>> >> protect against this. But given that you have the raw data, you should
>> be
>> >> able to obtain all the information you would need to understand the
>> >> correlations—either through the old Steiger 1980 formulas or (probably
>> >> better?) by bootstrapping the set of ES estimates from each study.
>> >>
>> >> James
>> >>
>> >> > On Jul 11, 2020, at 6:15 AM, Bernard Fernou <
>> bernard.fernou using gmail.com>
>> >> wrote:
>> >> >
>> >> > Hi everyone,
>> >> > I hope this email finds you well.
>> >> >
>> >> > I am trying to model the relationship between depression and BMI over
>> >> > several studies conducted in our research lab. We have access to all
>> >> > individual patient data.
>> >> >
>> >> > We have 7 seven studies in which BMI is always assessed using a
>> >> > digital scale and a meter but in which we used various measures to
>> >assess
>> >> > depression (self reported questionnaire / interviews / medical file).
>> >> Each
>> >> > study included either 1, 2 or the 3 measures of depression.  The
>> effect
>> >> > size used is correlation.
>> >> >
>> >> > We have analyzed our data using 3 methods (2 two-stage approaches:
>> robu
>> >> > from robumeta / 3 level meta analysis with robust SE using metafor;
>> and
>> >1
>> >> > one-stage approach: a mixed model using lme4).
>> >> >
>> >> > A complete reproducible example is provided below.
>> >> >
>> >> > Results from metafor and lme4 agreed (p values were not significant
>> and
>> >> > very similar: p = .13, .14). However, results from robumeta are quite
>> >> > different (the p value is barely significant p = .03). I tried to use
>> >> > several rho values in robumeta but it did not change anything. Note
>> that
>> >> > heterogeneity/inconsistency (assessed using tau² / I²) are very low.
>> >> >
>> >> > Unfortunately, as this meta analysis did not followed any systematic
>> >> > review, we made the mistake to not pre register it and to directly
>> >> analyze
>> >> > the data. Therefore, we have not any proof of what analysis what
>> thought
>> >> as
>> >> > primary. Even if the effet sizes are similar across 3 methods, it is
>> >> > probable that readers will be sensitive to p-values. The choice of
>> the
>> >> > statistical analysis is thus critical.
>> >> >
>> >> > I know that one-stage and two stages approaches could differ because
>> of
>> >> the
>> >> > model specification (Burke et al., 2017). Thus; I suspect that we
>> missed
>> >> a
>> >> > specification differing between robumeta on the one side and both
>> >> > metafor+lme4 on the other.
>> >> >
>> >> > I do not know if this mailing list is the right place to ask for help
>> >but
>> >> > if someone could give me a feedback on the code used and on the best
>> >> > approach to analyze our data, it would be greatly appreciated!
>> >> >
>> >> > Best,
>> >> >
>> >> > Bernard
>> >> >
>> >> >
>> >> > set.seed(1234)
>> >> > BMI<-rnorm(1000, 23, 4)
>> >> > Depression<-rnorm(1000, 10, 2)+0.1*BMI
>> >> > cor.test(~BMI+Depression)
>> >> > Study<-rep(1:6, times=c(100,100,100,300,200,200))
>> >> > Outcome<-c(rep(c("Questionnaire", "Interview"),
>> times=c(100,200)),#first
>> >> 3
>> >> > studies
>> >> >           rep(c("Questionnaire", "Interview", "MedicalFile"),
>> >> > each=100),#4th study
>> >> >           rep(c("Questionnaire", "Interview"), each=100,
>> times=2))#5&6th
>> >> > studies
>> >> >
>> >> > df<-data.frame(cbind(BMI,Depression,Study,Outcome))
>> >> >
>> >> > #see the structure of the data
>> >> > df %>%
>> >> >  group_by(Study ,Outcome) %>%
>> >> >  summarise(N=n())
>> >> >
>> >> > corFUNC=function(x){
>> >> >  list(
>> >> >    cor.test(~as.numeric(x$BMI)+as.numeric(x$Depression))$estimate,
>> >> >
>> (cor.test(~as.numeric(x$BMI)+as.numeric(x$Depression))$parameter+2))
>> >> > }
>> >> >
>> >> > df.cor<-split(df, list(df$Study, df$Outcome), drop=TRUE)
>> >> > lapply(df.cor,corFUNC)
>> >> >
>> >> > df.meta.1<-as.data.frame(do.call(rbind, lapply(df.cor,corFUNC)))
>> >> >
>> >> > df.meta<-df.meta.1 %>%
>> >> >  dplyr::mutate(Study.Outcome=row.names(df.meta.1)) %>%
>> >> >  tidyr::separate(Study.Outcome, into=c("Study", "Outcome")) %>%
>> >> >  dplyr::rename("Cor"=V1,"N"=V2) %>%
>> >> >  dplyr::arrange(Study) %>%
>> >> >  dplyr::mutate(
>> >> >    yi=yi<-escalc(measure="ZCOR", ri=as.numeric(Cor),
>> ni=as.numeric(N),
>> >> > data=df.meta)$yi,
>> >> >    vi=yi<-escalc(measure="ZCOR", ri=as.numeric(Cor),
>> ni=as.numeric(N),
>> >> > data=df.meta)$vi)
>> >> >
>> >> > Approach1=robumeta::robu(
>> >> >    formula = yi ~ 1,
>> >> >    v=vi,
>> >> >    studynum = Study,
>> >> >    small = TRUE,
>> >> >    rho=0.3,
>> >> >    data = df.meta)
>> >> >
>> >> >  df$BMI<-as.numeric(df$BMI); df$Depression<-as.numeric(df$Depression)
>> >> >
>> >> >
>> >>
>>
>> >Approach2=lme4::lmer(BMI~Depression+(1+Depression|Study)+(1+Depression|Outco
>> >me:Study),
>> >> > data=df)
>> >> >
>> >> >  Approach3=rma.mv(yi, vi, random = ~ 1 |Study/Outcome, data=df.meta)
>> >> >  robust(Approach3, cluster = df.meta$Study)
>>
>

	[[alternative HTML version deleted]]