[R-meta] Differences between metafor / robumeta / lme4

Sun Jul 12 19:20:50 CEST 2020

Dear Bernard,

What James was suggesting is to compute the correlations that you want to meta-analyze and their covariances and then meta-analyze the correlations using a multivariate model. To compute the covariances, you can either use equations provided by Steiger (1980) (and others) or use bootstrapping. For the latter, you would use the raw data, so your attempts to use bootstrapping on the correlations is headed in the wrong direction.

Let me focus on the first approach. In particular, see this:

https://gist.github.com/wviechtb/700983ab0bde94bed7c645fce770f8e9

This function will compute the covariances using the 'equation approach'. So, for each study, first construct the full correlation matrix. Let's say a study used 3 different depression measures plus one way of measuring BMI. Then there is a 4x4 matrix of correlations, which includes 3 correlations between depression and BMI (which is what you are interested in) but also 3 pairwise correlations between the depression measures. Give proper row and column names to this matrix. For example:

rownames(Rmat) <- colnames(Rmat) <- c("BMI", "SR", "INT", "MF")

(SR = self-report, INT = interviews, MD = medical file)

Pass this correlation matrix to the rmat() function, plus the sample size. I would also recommend rtoz=TRUE. Then you will get back the 6 r-to-z transformed correlations ($dat) and the 6x6 var-cov matrix of those r-to-z transformed correlations ($V).

Do this for all 7 studies. Make sure you use the same row and column naming scheme across studies. If a study only included, for example, SR and INT, then you have a 3x3 matrix.

Combine $dat from all 7 studies into one big data frame ('dat'). Combine $V into one big block-diagonal matrix ('V') (bldiag() from metafor can be used for this).

Then fit a multivariate model with:

res <- rma.mv(yi, V, mods = ~ var1var2 - 1, random = ~ var1var2 | study, struct="UN", data=dat)
res

You will get pooled estimates of the r-to-z transformed correlations for all possible pairs. This will also include pairs that you are not interested in (e.g., for "SR" and "INT"). You are interested in the BMI.SR, BMI.INT, and BMI.MF correlations.

You can now back-transform the pooled estimates to correlations using:

predict(res, diag(6), transf=transf.ztor)

You can also test if there are differences between the BMI.SR, BMI.INT, and BMI.MF correlations using anova(). For example:

anova(res, L = rbind(c(1,-1,0,0,0,0), c(1,0,-1,0,0,0), c(0,1,-1,0,0,0))

but you need to make sure that the position of the 1 and -1's actually correspond to the position of the 3 coefficients you are interested in. A global (2 df) test of those differences will be part of this output:

anova(res, L = rbind(c(1,-1,0,0,0,0), c(1,0,-1,0,0,0))

If there is no evidence for differences, you could consider collapsing the three depression measure levels ("SR", "INT", "MF") into one. For example:

predict(res, c(1/3, 1/3, 1/3, 0, 0, 0), transf=transf.ztor)

(again, I am assuming that the first three coefficients are the BMI.<dep measure> correlations, but check that first in your output). Or you could collapse the codings in 'dat' and refit the model.

Best,
Wolfgang

>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>On Behalf Of Bernard Fernou
>Sent: Sunday, 12 July, 2020 13:05
>To: James Pustejovsky
>Cc: r-sig-meta-analysis using r-project.org
>Subject: Re: [R-meta] Differences between metafor / robumeta / lme4
>
>Dear James,
>
>Thank you very much for your prompt reply! We would like to determine
>whether there is a crude association between BMI and depression symptoms in
>patients with a neurodevelopmental disorder. We would like to draw a random
>effects inference.
>
>We understand the interest in trying to capture the correlation between the
>effect size estimates. I have read the chapter of Gleser & Olkin (2009)
>but, since it did not describe formulas when correlations are used as ES,
>we did not know how to move forward with our situation. This is why we have
>adopted a RVE approach.
>
>I am very interested in trying to apply the two approaches you suggest. I
>have tried to read a bit on bootstrapping in the context of meta-analysis
>but that exceeded my (very basic) level of statistical knowledge.
>
>I found some resources on metafor's website on bootstrapping (
>http://www.metafor-project.org/doku.php/analyses:viechtbauer2007a).
>Although I managed to implement it on our data, I was not sure that this
>approach is the one you recommended to us (essentially, I did not
>understand how it provides information on the correlation between our ES).
>Should the data.gen function be modified to integrate the correlation
>between the effect sizes?
>  data.gen <- function(dat, mle) {
>    data.frame(yi=rnorm(nrow(dat), mle$mu,
>                        sqrt(mle$tau2 + dat$vi)), vi=dat$vi)
>  }
>
>If you have some resources that could guide us, it would be very
>appreciated.
>
>Thank you so much for your invaluable help!
>
>Bernard
>
>Le sam. 11 juil. 2020 à 15:23, James Pustejovsky <jepusto using gmail.com> a
>écrit :
>
>> Bernard,
>> Could you clarify what is the goal of your analysis? What question(s) are
>> you trying to answer? And in particular, are you aiming to draw a random
>> effects inference (to a broader population of studies) or a fixed effects
>> inference (limited to the set of studies in the analysis)?
>>
>> Generally, I would recommend using an approach that captures the
>> correlations between the effect size estimates (i.e., the correlation
>> between the estimated correlation coefficients). As far as I can tell,
>none
>> of the approaches you listed really does that. The robumeta and metafor
>> models make arbitrary assumptions about the correlation, then use RVE to
>> protect against this. But given that you have the raw data, you should be
>> able to obtain all the information you would need to understand the
>> correlations—either through the old Steiger 1980 formulas or (probably
>> better?) by bootstrapping the set of ES estimates from each study.
>>
>> James
>>
>> > On Jul 11, 2020, at 6:15 AM, Bernard Fernou <bernard.fernou using gmail.com>
>> wrote:
>> >
>> > Hi everyone,
>> > I hope this email finds you well.
>> >
>> > I am trying to model the relationship between depression and BMI over
>> > several studies conducted in our research lab. We have access to all
>> > individual patient data.
>> >
>> > We have 7 seven studies in which BMI is always assessed using a
>> > digital scale and a meter but in which we used various measures to
>assess
>> > depression (self reported questionnaire / interviews / medical file).
>> Each
>> > study included either 1, 2 or the 3 measures of depression.  The effect
>> > size used is correlation.
>> >
>> > We have analyzed our data using 3 methods (2 two-stage approaches: robu
>> > from robumeta / 3 level meta analysis with robust SE using metafor; and
>1
>> > one-stage approach: a mixed model using lme4).
>> >
>> > A complete reproducible example is provided below.
>> >
>> > Results from metafor and lme4 agreed (p values were not significant and
>> > very similar: p = .13, .14). However, results from robumeta are quite
>> > different (the p value is barely significant p = .03). I tried to use
>> > several rho values in robumeta but it did not change anything. Note that
>> > heterogeneity/inconsistency (assessed using tau² / I²) are very low.
>> >
>> > Unfortunately, as this meta analysis did not followed any systematic
>> > review, we made the mistake to not pre register it and to directly
>> analyze
>> > the data. Therefore, we have not any proof of what analysis what thought
>> as
>> > primary. Even if the effet sizes are similar across 3 methods, it is
>> > probable that readers will be sensitive to p-values. The choice of the
>> > statistical analysis is thus critical.
>> >
>> > I know that one-stage and two stages approaches could differ because of
>> the
>> > model specification (Burke et al., 2017). Thus; I suspect that we missed
>> a
>> > specification differing between robumeta on the one side and both
>> > metafor+lme4 on the other.
>> >
>> > I do not know if this mailing list is the right place to ask for help
>but
>> > if someone could give me a feedback on the code used and on the best
>> > approach to analyze our data, it would be greatly appreciated!
>> >
>> > Best,
>> >
>> > Bernard
>> >
>> >
>> > set.seed(1234)
>> > BMI<-rnorm(1000, 23, 4)
>> > Depression<-rnorm(1000, 10, 2)+0.1*BMI
>> > cor.test(~BMI+Depression)
>> > Study<-rep(1:6, times=c(100,100,100,300,200,200))
>> > Outcome<-c(rep(c("Questionnaire", "Interview"), times=c(100,200)),#first
>> 3
>> > studies
>> >           rep(c("Questionnaire", "Interview", "MedicalFile"),
>> > each=100),#4th study
>> >           rep(c("Questionnaire", "Interview"), each=100, times=2))#5&6th
>> > studies
>> >
>> > df<-data.frame(cbind(BMI,Depression,Study,Outcome))
>> >
>> > #see the structure of the data
>> > df %>%
>> >  group_by(Study ,Outcome) %>%
>> >  summarise(N=n())
>> >
>> > corFUNC=function(x){
>> >  list(
>> >    cor.test(~as.numeric(x$BMI)+as.numeric(x$Depression))$estimate,
>> >    (cor.test(~as.numeric(x$BMI)+as.numeric(x$Depression))$parameter+2))
>> > }
>> >
>> > df.cor<-split(df, list(df$Study, df$Outcome), drop=TRUE)
>> > lapply(df.cor,corFUNC)
>> >
>> > df.meta.1<-as.data.frame(do.call(rbind, lapply(df.cor,corFUNC)))
>> >
>> > df.meta<-df.meta.1 %>%
>> >  dplyr::mutate(Study.Outcome=row.names(df.meta.1)) %>%
>> >  tidyr::separate(Study.Outcome, into=c("Study", "Outcome")) %>%
>> >  dplyr::rename("Cor"=V1,"N"=V2) %>%
>> >  dplyr::arrange(Study) %>%
>> >  dplyr::mutate(
>> >    yi=yi<-escalc(measure="ZCOR", ri=as.numeric(Cor), ni=as.numeric(N),
>> > data=df.meta)$yi,
>> >    vi=yi<-escalc(measure="ZCOR", ri=as.numeric(Cor), ni=as.numeric(N),
>> > data=df.meta)$vi)
>> >
>> > Approach1=robumeta::robu(
>> >    formula = yi ~ 1,
>> >    v=vi,
>> >    studynum = Study,
>> >    small = TRUE,
>> >    rho=0.3,
>> >    data = df.meta)
>> >
>> >  df$BMI<-as.numeric(df$BMI); df$Depression<-as.numeric(df$Depression)
>> >
>> >
>>
>Approach2=lme4::lmer(BMI~Depression+(1+Depression|Study)+(1+Depression|Outco
>me:Study),
>> > data=df)
>> >
>> >  Approach3=rma.mv(yi, vi, random = ~ 1 |Study/Outcome, data=df.meta)
>> >  robust(Approach3, cluster = df.meta$Study)