[R-meta] Chi-square or F-test to test for subgroup heterogeneity

James Pustejovsky jepusto at gmail.com
Tue Oct 9 22:31:10 CEST 2018


I agree with Wolfgang's assessment of the potential small-sample
corrections in this situation. Satterthwaite or Kenward-Roger corrections
should provide better Type I error control than z- or chi-squared tests,
but I do not know of readily available tools for doing these calculations
with rma.mv models. (Any students reading this, don't look now but a
dissertation topic just fell into your lap!) However, I believe that
Kenward-Roger is available in SAS. Partlett and Riley have examined the
performance of KR corrections for univariate random effects meta-analysis,
and Owens and Ferron have examined it for multilevel meta-analysis (in the
context of meta-analysis of single-case designs). I do not know of research
that has examined KR for multivariate meta-analysis.

In the absence of available tools, I think that it would be acceptable (and
likely conservative) to use degrees of freedom for a t- or F-test equal to
the number of independent clusters of effect sizes (typically the number of
studies), minus the number of predictors in the model that vary between
clusters. For example, say that you have a total of 20 studies, and you are
testing for differences in average effect sizes across four categories
(e.g., four regions). Then take the df to be 20 - 4 = 16.
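
Something along these lines would do it (a rough sketch only; the data frame
dat and the column names yi, vi, region, study, and esid are placeholders,
not from your analysis):

library(metafor)

## Hypothetical data: effect sizes yi, variances vi, a moderator 'region',
## and identifiers 'study' (cluster) and 'esid' (effect size within study).
res <- rma.mv(yi, vi, mods = ~ region,
              random = ~ 1 | study/esid, data = dat)

## Omnibus Wald statistic for the moderator coefficients (chi-square scale)
QM <- res$QM
m  <- res$m                               # number of coefficients tested
df2 <- length(unique(dat$study)) - res$p  # clusters minus model coefficients

## Conservative F-approximation using the df suggested above
Fstat <- QM / m
pval  <- pf(Fstat, df1 = m, df2 = df2, lower.tail = FALSE)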

Another option would be to use tests based on robust variance estimation.
There are robust versions of t- and F-tests that incorporate small-sample
corrections and provide good Type I error control in relatively small
samples. The tests are available in the clubSandwich package: t-tests using
coef_test(), F-tests using Wald_test(). The F-tests tend to get
conservative if you are testing a large number of moderators jointly (e.g.,
testing equality among 6+ different categories). The drawback of this
approach is that it is likely to be less powerful than using model-based
variance estimates because it is based on a weaker set of assumptions. For
instance, suppose again that you are testing for equality among four
categories, A through D. The model-based variance estimator typically
assumes that the between-study heterogeneity is constant across categories,
whereas the robust variance estimator allows for different levels of
heterogeneity in category A, category B, category C, and category D.
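
To make this option concrete, here is a minimal sketch (placeholder column
names again; note that constrain_equal() comes from more recent versions of
clubSandwich, and the cell-means parameterization with ~ 0 + category makes
it the natural way to express the equality test):

library(metafor)
library(clubSandwich)

## Hypothetical data: yi, vi, a four-level factor 'category' (A-D),
## and a 'study' identifier used for clustering.
res <- rma.mv(yi, vi, mods = ~ 0 + category,
              random = ~ 1 | study/esid, data = dat)

## Robust t-tests of the category means (CR2 correction, Satterthwaite df)
coef_test(res, vcov = "CR2", cluster = dat$study, test = "Satterthwaite")

## Small-sample corrected robust F-test of equality across the categories
Wald_test(res, constraints = constrain_equal(1:4),
          vcov = "CR2", cluster = dat$study, test = "HTZ")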

James

Owens, C. M., & Ferron, J. M. (2012). Synthesizing single-case studies: A
Monte Carlo examination of a three-level meta-analytic model. *Behavior
Research Methods*, *44*(3), 795-805.

Partlett, C., & Riley, R. D. (2017). Random effects meta-analysis: Coverage
performance of 95% confidence and prediction intervals following REML
estimation. *Statistics in Medicine*, *36*(2), 301-317.


On Tue, Oct 9, 2018 at 10:18 AM Viechtbauer, Wolfgang (SP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:

> Hi Ty,
>
> If we want to be picky, neither test="z" nor test="t" in rma.mv() is
> really justifiable. Using z- and chi-square tests ignores the uncertainty
> in the estimated variance components and can lead to inflated Type I error
> rates (but also overly conservative rates when there is very little or no
> heterogeneity).
>
> Using test="t" naively uses t- and F-tests with degrees of freedom equal
> to p and k-p dfs (where k is the total number of estimates and p the total
> number of model coefficients), but this is really an ad-hoc method -- that
> may indeed provide somewhat better control of the Type I error rates (at
> least when they are inflated to begin with), but again, the use of t- and
> F-distributions isn't properly motivated and the computation of the dfs is
> overly simplistic.
>
> The Knapp & Hartung method that is available for rma.uni() with
> test="knha" not only uses t- and F-tests, but also adjusts the standard
> errors in such a way that one actually gets t- and F-distributions under
> the null (technically, there is some fudging also involved in the K&H
> method, but numerous simulation studies have shown that this appears to be
> a non-issue).
>
> Unfortunately, test="knha" is not (currently) available for rma.mv(). A
> generalization of the K&H method to 'rma.mv' models is possible, but I
> have not implemented this so far, because further research is needed to
> determine if this is really useful.
>
> Another route would be to use t- and F-distributions, but with a
> Satterthwaite approximation to the dfs. I have examined this for rma.uni()
> models, but this appears to be overly conservative, especially under low
> heterogeneity. For moderate to large heterogeneity, this does appear to
> work though. Further research is also needed here to determine how well
> this would work for 'rma.mv' models. Also, working out how to implement
> this in general for 'rma.mv' models isn't trivial. The same applies to
> the method by Kenward and Roger.
>
> Maybe James (Pustejovsky) can also chime in here, since, together with
> Elizabeth Tipton, he has done some work on this topic when using
> cluster-robust inference methods.
>
> Best,
> Wolfgang
>
> -----Original Message-----
> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org]
> On Behalf Of Ty Beal
> Sent: Friday, 05 October, 2018 21:03
> To: r-sig-meta-analysis at r-project.org
> Subject: [R-meta] Chi-square or F-test to test for subgroup heterogeneity
>
> Hi all,
>
> I estimated mean frequency of consumption as well as prevalence of
> less-than-daily fruit and vegetable consumption, at-least-daily carbonated
> beverage consumption, and at-least-weekly fast food consumption among
> school-going adolescents aged primarily 12-17 years from Africa, Asia,
> Oceania, and Latin America between 2008 and 2015. Random-effects
> meta-analysis was used to pool estimates globally and by WHO region, World
> Bank income group, and food system typology.
>
> To keep things simple, I will just ask about region. There are 5 regions
> included in the analysis. I would like to first test whether there is
> significant heterogeneity between all regions (omnibus test), and if so
> then do pairwise tests between specific regions. I am using rma.mv() with
> mods as the 5 regions and want to know whether I should use the default “z”
> statistic, which for the omnibus test is based on a chi-square distribution,
> or “t”, which for the omnibus test is based on the F-distribution.
>
> Best,
>
> Ty Beal, PhD
> Technical Specialist
> Knowledge Leadership
>
> GAIN – Global Alliance for Improved Nutrition
> 1509 16th Street NW, 7th Floor | Washington, DC 20036
> tbeal at gainhealth.org
> C: +1 (602) 481-5211
> Skype: tyroniousbeal
> _______________________________________________
> R-sig-meta-analysis mailing list
> R-sig-meta-analysis at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>
