[R-sig-ME] Dependence of Effect Sizes in meta analysis metafor (some statistical theory questions included)

Sat Jun 27 10:26:28 CEST 2015

There are already some excellent responses. Let me add a few comments of my own.

Brian mentioned the Konstantopoulos (2011) reanalysis (http://www.metafor-project.org/doku.php/analyses:konstantopoulos2011), but it is also useful to carefully examine the two pages it links to (http://www.metafor-project.org/doku.php/analyses:berkey1998 and http://www.metafor-project.org/doku.php/analyses:gleser2009). These discuss some methods/models for when the *sampling errors* cannot be assumed to be independent.

In fact, I think there is a lot of confusion out there in the literature as to what type of methods/models are needed in multivariate/multilevel-type of situations when conducting a meta-analysis. To be more specific, when the same sample is used to obtain two (or more) effect size estimates (either because the same outcome is measured at two measurement occasions or because two different outcomes are measured), then the sampling errors of the estimates are likely to be dependent. I would emphasize that this dependence pertains to the *sampling errors* of the estimates -- and just like we can derive equations to compute (or rather, estimate) the variance of the sampling error for various effect size measures (e.g., log odds/risk ratios, raw or standardized mean differences, and so on), we can derive the necessary equations to compute/estimate the covariance of the sampling errors. This is what Gleser and Olkin (2009) describe at great length in their chapter (http://www.metafor-project.org/doku.php/analyses:gleser2009).

Unless the measurements over time or for multiple outcomes can be assumed to be independent (which would be very unlikely in most situations), it is a mathematical/statistical fact that the sampling errors for multiple estimates derived based on those measurements are going to be dependent as well.

Mike also nicely described that there can be another dependence as well, namely that of the underlying true effects. That is, the underlying true effects corresponding to the estimates arising from the same study/sample may be more similar to each other than those arising from different samples/studies (i.e., they may be correlated). Whether this is really the case or not is an empirical question and needs to be examined by estimating the covariance/correlation between the true effects (by means of an appropriate multilevel/multivariate model). This is illustrated/discussed, for example, in the paper by Berkey et al. (1998) (http://www.metafor-project.org/doku.php/analyses:berkey1998).

In fact, this type of dependence may also be present when the sampling errors can be safely assumed to be uncorrelated within a study. For example, the two effect size estimates may pertain to the group of men and women within the study -- since the two samples do not overlap, the sampling errors can be assumed to be independent. Or multiple estimates may be available based on different studies reported in the same paper, based on completely different samples. This type of situation is in fact what is described in the Konstantopoulos (2011) paper (http://www.metafor-project.org/doku.php/analyses:konstantopoulos2011).
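
In symbols, the two types of dependence can be sketched as follows for a study i that contributes two estimates. The notation (r_i for the correlation between the sampling errors, \rho for the correlation between the true effects) is shorthand chosen for this illustration and is not taken from the papers cited above.

```latex
% Study i contributes two estimates y_{i1}, y_{i2} of the true effects
% \theta_{i1}, \theta_{i2}, with sampling errors e_{i1}, e_{i2}.
\[
y_{ij} = \theta_{ij} + e_{ij}, \qquad
\operatorname{Var}\begin{pmatrix} e_{i1} \\ e_{i2} \end{pmatrix} = V_i =
\begin{pmatrix}
v_{i1} & r_i \sqrt{v_{i1} v_{i2}} \\
r_i \sqrt{v_{i1} v_{i2}} & v_{i2}
\end{pmatrix}
\]
% The true effects vary around their means \mu_1, \mu_2 across studies.
\[
\theta_{ij} = \mu_j + u_{ij}, \qquad
\operatorname{Var}\begin{pmatrix} u_{i1} \\ u_{i2} \end{pmatrix} = T =
\begin{pmatrix}
\tau_1^2 & \rho \tau_1 \tau_2 \\
\rho \tau_1 \tau_2 & \tau_2^2
\end{pmatrix}
\]
% Marginally, both sources add up (assuming e_i and u_i are independent).
\[
\operatorname{Var}\begin{pmatrix} y_{i1} \\ y_{i2} \end{pmatrix} = V_i + T
\]
```

The first type of dependence is the off-diagonal element of V_i; the second is the off-diagonal element of T. With non-overlapping samples (e.g., men and women within a study), r_i = 0 and V_i is diagonal, but \rho may still be nonzero.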

However, it seems to me that there is a general misconception that it is sufficient to only consider/deal with the second type of dependence (i.e., the possible correlation among the underlying true effects) -- and the dependence in the sampling errors is ignored. There may be a rather pragmatic reason for that: Computing/estimating the covariance of the sampling errors typically requires information that is often not reported in the studies (e.g., the correlation among the measurements across multiple measurement occasions or the correlation among the different outcomes). That is, however, not a good/appropriate reason to ignore this dependence.

As Mike mentioned, there may be situations where ignoring this dependence may still lead to valid inferences, but the decision to ignore this dependence should then be made based on a very thorough consideration of all aspects of the analysis (data, model, assumptions, ...) and not be done routinely.

If the covariance among the sampling errors cannot be computed (even after contacting study authors to obtain the missing information needed), there are several options:

1) One can still often make a rough/educated guess as to how large the correlations (or whatever else is needed to compute the covariances) are. Then one uses those 'guesstimates' and conducts sensitivity analyses to ensure that conclusions remain unchanged when the values are varied within a reasonable range.

2) Theodore mentioned using robust methods -- in essence, we then consider the assumed variance-covariance matrix to be misspecified (i.e., we assume it is diagonal, when in fact we know it isn't) and then estimate the variance-covariance matrix of the fixed effects (which are typically of primary interest) using consistent methods even under such a model misspecification.

3) Resampling methods (i.e., bootstrapping and permutation testing) may also work.

4) There are also some alternative models that try to circumvent the problem by means of some simplification of the model. Specifically, in the model/approach by Riley and colleagues (see, for example: Riley, R. D., Abrams, K. R., Lambert, P. C., Sutton, A. J., & Thompson, J. R. (2007). An evaluation of bivariate random-effects meta-analysis for the joint synthesis of two correlated outcomes. Statistics in Medicine, 26(1), 78-97.), we assume that the correlation among the sampling errors is identical to the correlation among the underlying true effects, and then we just estimate that one correlation. This can work, but whether it does depends on how well that simplification matches up with reality.

5) There is always another option: Avoid any kind of statistical dependence via data reduction (e.g., selecting only one estimate, conducting separate analyses for different outcomes). This is still the most commonly used approach to 'handling' the problem, because it allows practitioners to stick to (relatively simple) models/methods/software they are already familiar with. But this approach can be wasteful and limits inference (e.g., if we conduct two separate meta-analyses for outcomes A and B, we cannot test whether the estimated effect is different for A and B unless we can again properly account for their covariance).
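
To make options 1) and 2) a bit more concrete, here is a minimal sketch in R. It is only an illustration under stated assumptions, not the analysis from this thread: it uses the dat.berkey1998 data that comes with metafor as a stand-in, make_V() is a hypothetical helper written for this example, and the correlations 0, 0.3, and 0.6 are arbitrary guesses. It also assumes that the rows are sorted by trial, so that the blocks of V line up with the rows of the data.

```r
library(metafor)

# Stand-in data: two outcomes per trial (Berkey et al., 1998).
# Here we pretend the sampling covariances (v1i, v2i) are unknown
# and only the sampling variances (vi) were reported.
dat <- dat.berkey1998

# Hypothetical helper (not part of metafor): build a block-diagonal V
# matrix from the sampling variances and ONE guessed correlation r that
# is applied to all estimates within the same cluster.
# Assumes the rows of the data are sorted by cluster.
make_V <- function(vi, cluster, r) {
  blocks <- lapply(split(vi, cluster), function(v) {
    S <- diag(sqrt(v), nrow = length(v))            # standard errors
    R <- matrix(r, nrow = length(v), ncol = length(v))
    diag(R) <- 1                                    # correlation matrix
    S %*% R %*% S                                   # covariance block
  })
  bldiag(blocks)
}

# Option 1: sensitivity analysis over a range of guessed correlations.
for (r in c(0, 0.3, 0.6)) {
  V   <- make_V(dat$vi, dat$trial, r)
  res <- rma.mv(yi, V, mods = ~ outcome - 1, random = ~ outcome | trial,
                struct = "UN", data = dat, method = "ML")
  cat("assumed r =", r, "\n")
  print(coef(summary(res)))
}

# Option 2: fit a working model that treats the sampling errors as
# independent (diagonal V), then use cluster-robust inference for the
# fixed effects.
res0 <- rma.mv(yi, vi, mods = ~ outcome - 1, random = ~ 1 | trial, data = dat)
print(robust(res0, cluster = dat$trial))
```

If the estimates and their standard errors stay about the same across the guessed correlations, that is reassuring. With only five trials, the cluster-robust results here should be read as an illustration of the syntax rather than as something to rely on.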

P.S.: Brian -- in your rma.mv() call, you used 'mods = TypeSE'. I would suggest using 'mods = ~ TypeSE' instead (you can make use of formula syntax for the 'mods' argument).

Best,
Wolfgang

-- 
Wolfgang Viechtbauer, Ph.D., Statistician | Department of Psychiatry and    
Neuropsychology | Maastricht University | P.O. Box 616 (VIJV1) | 6200 MD    
Maastricht, The Netherlands | +31 (43) 388-4170 | http://www.wvbauer.com    

> -----Original Message-----
> From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-
> project.org] On Behalf Of Mike Cheung
> Sent: Saturday, June 27, 2015 09:25
> To: Theodore Lytras
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Dependence of Effect Sizes in meta analysis
> metafor (some statistical theory questions included)
> 
> Hello,
> 
>  There are several types of dependence in a meta-analysis. The effect sizes
> can be conditionally dependent because the same samples (participants) are
> used in the analysis. There are formulas to estimate the amount of
> dependence (sampling covariances among effect sizes) when enough summary
> statistics are given. The second type of dependence is the covariances
> among the true effect sizes at the population level. Both of them are
> assumed in a multivariate random-effects meta-analysis. If the latter
> (covariances among the true effect sizes) is assumed zero, it becomes a
> multivariate fixed-effects meta-analysis.
> 
>  A third type of dependence happens when there are multiple effect sizes
> reported by the same study. One typical issue is that we may not have
> enough information to estimate the sampling covariances among effect sizes.
> Thus, the multivariate meta-analysis cannot be used. If we take the
> assumptions that (1) the amount of dependence is random and (2) the effect
> sizes are conditionally independent after controlling for the random
> effects, we may use a three-level meta-analysis to model it. This is
> basically what the note means.
> 
>  Some researchers further suggested using the three-level meta-analysis to
> conduct the multivariate meta-analysis because there is no need to
> calculate the conditional sampling covariances among the effect sizes.
> Under some assumptions (see the following links), this approach works. On
> the other hand, the three-level meta-analysis is a special case of the
> multivariate meta-analysis by imposing a few constraints. Since the
> multivariate and three-level meta-analyses are related, I would suggest
> studying both of them at the same time and seeing which one fits better for
> your data and research questions.
> 
>  The following are some excerpts from my book that are related to my
> points.
> 
> https://books.google.com.sg/books?id=sp3TBgAAQBAJ&pg=PA121&dq=5.1.1+Types+of+dependence&hl=en&sa=X&ei=YUmOVfWKHpOGuASOsbTABg&redir_esc=y#v=onepage&q=5.1.1%20Types%20of%20dependence&f=false
> 
> https://books.google.com.sg/books?id=sp3TBgAAQBAJ&pg=PA195&dq=6.4+Relationship+between+the+multivariate+and+the+three-level+meta-analyses&hl=en&sa=X&ei=bEiOVZibL8ytuQS_moGwCg&redir_esc=y#v=onepage&q=6.4%20Relationship%20between%20the%20multivariate%20and%20the%20three-level%20meta-analyses&f=false
> 
>  Regards,
> Mike
> 
> --
> ---------------------------------------------------------------------
>  Mike W.L. Cheung               Phone: (65) 6516-3702
>  Department of Psychology       Fax:   (65) 6773-1843
>  National University of Singapore
>  http://courses.nus.edu.sg/course/psycwlm/internet/
> ---------------------------------------------------------------------
> 
> On Sat, Jun 27, 2015 at 2:50 PM, Theodore Lytras <thlytras at gmail.com>
> wrote:
> 
> > On Sat 27 Jun 2015 06:05:39, Drwecki, Brian B wrote:
> > > Hello all,
> > >
> > > I apologize for the long post, but I want to be thorough.
> > >
> > > My Goal: To conduct the appropriate mixed-effects (random effects meta
> > > analysis model + one fixed effects moderator with two categorical levels)
> > > meta analysis where 11 of 38 papers/studies present effects for both levels
> > > of my Fixed Effects moderator (i.e. these 11 studies provide 2 effect sizes
> > > each = 22 total estimates; each pair of effects is dependent and violates
> > > assumptions of independence).
> > [snip, snip]
> >
> > Hello,
> >
> > Maybe you can check package "robumeta" and the associated paper:
> >
> > Hedges LV, Tipton E, Johnson MC. Robust variance estimation in
> > meta-regression with dependent effect size estimates. Res Synth Method.
> > 2010;1(1):39–65.
> > http://onlinelibrary.wiley.com/doi/10.1002/jrsm.5/abstract
> >
> > I've recently dealt with a similar meta-analysis situation, with hierarchical
> > dependence (multiple effect estimates clustered within the same study), and
> > this approach worked well for me.
> >
> > As a plus, you can have "robumeta" play nice ball with "metafor":
> >
> > http://blogs.edb.utexas.edu/pusto/2014/04/21/a-meta-sandwich/
> >
> > Hope this helps!
> >
> > Kind regards,
> >
> > Theodore Lytras