[R-meta] extracting data from multi-way ANOVAs

Wed Oct 17 13:18:23 CEST 2018

Dear Ilya,

The equation |d| = sqrt(F*(nt + nc)/(nt*nc)) works for computing a standardized mean difference from a one-way ANOVA when the F-value has df=1 in the numerator (i.e., when exactly two groups are compared). An example:

dat <- data.frame(grp = c(1,1,1,1,1,1,2,2,2,2),
                  foo = c(1,1,2,2,3,3,1,1,2,3),
                  y   = c(2,4,3,2,3,4,6,4,3,4))

write.table(dat, file="data.dat", row.names=FALSE, quote=FALSE)

### one-way ANOVA

res <- summary(aov(y ~ factor(grp), data=dat))
res

### compute |d| = sqrt(F * (n1 + n2) / (n1 * n2))

n1 <- sum(dat$grp == 1)
n2 <- sum(dat$grp == 2)
sqrt(res[[1]]$F[1] * (n1 + n2) / (n1 * n2))

### compute d manually (same value)

m1  <- mean(dat$y[dat$grp == 1])
m2  <- mean(dat$y[dat$grp == 2])
sd1 <- sd(dat$y[dat$grp == 1])
sd2 <- sd(dat$y[dat$grp == 2])
(m1 - m2) / sqrt(((n1-1) * sd1^2 + (n2-1) * sd2^2) / (n1 + n2 - 2))

For other types of ANOVA models (even if the F-value you are using is based on df=1 in the numerator), this will not give you the same value. For example:

### two-way ANOVA (with Type I SS)

res <- summary(aov(y ~ factor(grp) * factor(foo), data=dat))
res
sqrt(res[[1]]$F[1] * (n1 + n2) / (n1 * n2)) # not right

One can actually correct for this:

F.grp  <- res[[1]]$F[1]
F.foo  <- res[[1]]$F[2]
F.int  <- res[[1]]$F[3]
df.foo <- res[[1]]$Df[2]
df.int <- res[[1]]$Df[3]
df.err <- res[[1]]$Df[4]

F <- F.grp * (df.err + df.foo + df.int) / (df.err + df.foo * F.foo + df.int * F.int)
F # same F-value as for one-way ANOVA

So, if you know the dfs and F-values of the other factors (and of the interaction), then you can 'recover' the F-value as if a one-way ANOVA had been fitted and then go from F to d (but of course we lose the sign).
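
The same steps can be wrapped into a small function, so that the correction can be applied directly to the dfs and F-values reported in a paper. This is only a sketch: the name recover_F and its arguments are made up for illustration (they are not part of any package), and it assumes Type I SS with the factor of interest entered first. The data are the same toy data as in the example.

```r
### sketch: recover the one-way F-value from a factorial ANOVA table
### ('recover_F' is a hypothetical helper, not a function from a package;
### assumes Type I SS with the factor of interest entered first)

recover_F <- function(F.target, df.err, F.other, df.other) {
   # F.other and df.other: F-values and numerator dfs of all other terms
   # (other main effects and interactions) in the factorial model
   F.target * (df.err + sum(df.other)) / (df.err + sum(df.other * F.other))
}

dat <- data.frame(grp = c(1,1,1,1,1,1,2,2,2,2),
                  foo = c(1,1,2,2,3,3,1,1,2,3),
                  y   = c(2,4,3,2,3,4,6,4,3,4))

tab1 <- summary(aov(y ~ factor(grp), data=dat))[[1]]
tab2 <- summary(aov(y ~ factor(grp) * factor(foo), data=dat))[[1]]

### rows of tab2: 1 = grp, 2 = foo, 3 = grp:foo, 4 = residuals

F.rec <- recover_F(F.target = tab2[["F value"]][1],
                   df.err   = tab2[["Df"]][4],
                   F.other  = tab2[["F value"]][2:3],
                   df.other = tab2[["Df"]][2:3])
F.one <- tab1[["F value"]][1]

c(recovered = F.rec, oneway = F.one)

### go from the recovered F-value to |d|

n1 <- sum(dat$grp == 1)
n2 <- sum(dat$grp == 2)
d.abs <- sqrt(F.rec * (n1 + n2) / (n1 * n2))
d.abs
```

Since F.other and df.other are vectors, the same function should also cover three-way and higher-way designs, as long as all other terms in the model are included.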

The equation/method above is based on:

Morris, S. B., & DeShon, R. P. (1997). Correcting effect sizes computed from factorial analysis of variance for use in meta-analysis. Psychological Methods, 2, 192-199.

Note that this works for a 'standard' two-way ANOVA (and can be extended to three-way and higher-way ANOVAs), but not for repeated-measures or mixed-effects models.

Also note that this works for 'Type I Sums of Squares', which is what R computes by default, and the factor of interest must be entered first into the model. This correction does not work for Type III tests, which a lot of other software uses by default. One might be able to work out the correction for that case as well, but I haven't attempted to do so.
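
To illustrate why the order of the factors matters with Type I SS, here is a small check using the same toy data (which are unbalanced, so the sequential SS depend on the order). It is only a sketch: the helper correct_F is a made-up name, and it simply applies the correction to whichever row of the table is given as the factor of interest.

```r
### sketch: the correction depends on the factor of interest being first
### ('correct_F' is a hypothetical helper written for this illustration)

dat <- data.frame(grp = c(1,1,1,1,1,1,2,2,2,2),
                  foo = c(1,1,2,2,3,3,1,1,2,3),
                  y   = c(2,4,3,2,3,4,6,4,3,4))

correct_F <- function(tab, target) {
   # tab: table from summary(aov(...))[[1]] with residuals in the last row
   # target: row number of the factor of interest
   err   <- nrow(tab)
   other <- setdiff(seq_len(err - 1), target)
   Fval  <- tab[["F value"]]
   df    <- tab[["Df"]]
   Fval[target] * (df[err] + sum(df[other])) /
      (df[err] + sum(df[other] * Fval[other]))
}

F.one <- summary(aov(y ~ factor(grp), data=dat))[[1]][["F value"]][1]

### grp entered first (row 1): correction recovers the one-way F-value

tab.first <- summary(aov(y ~ factor(grp) * factor(foo), data=dat))[[1]]
F.first   <- correct_F(tab.first, target=1)

### grp entered second (row 2): correction no longer recovers it

tab.second <- summary(aov(y ~ factor(foo) * factor(grp), data=dat))[[1]]
F.second   <- correct_F(tab.second, target=2)

c(oneway = F.one, grp.first = F.first, grp.second = F.second)
```

In a balanced design, the order would not matter, but in general one cannot tell from a published table alone in which order the factors were entered or which type of SS was used.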

One could also try to develop appropriate methods for converting 1-df F-values to d values for other types of cases, but that would require quite a bit of thinking. Could you maybe try to contact the authors and ask them to just give you the means, SDs, and group sizes for the two groups that you are interested in?

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Ilya Fischhoff
Sent: Tuesday, 16 October, 2018 15:29
To: r-sig-meta-analysis using r-project.org
Subject: [R-meta] extracting data from multi-way ANOVAs

Hello everyone,

I'm new to this listserv, and excited to learn about it. My apologies
if this question has been addressed before and I arrived too late to
read it.

I'm working on a meta-analysis for which I would like to use data as
reported in (repeated measures) multi-way ANOVAs or mixed effects
models. In these papers, means and variances of groups are not
available. For papers reporting results of one-way ANOVAs, I've been
using this formula relating the absolute value of Cohen's d to F ratio
and sample sizes: |d| = sqrt(F*(nt + nc)/(nt*nc)), where nt and nc are
treatment and control sample sizes. (Source: Koricheva, J., et al.
(2013). Handbook of meta-analysis in ecology and evolution, Princeton
University Press. p. 200.)

Does this same formula apply to the F ratio and sample sizes in a two-
or three-way ANOVA, or a linear mixed effects model?

I noticed, using synthetic data, that the F value resulting from a
one-way ANOVA differs from the F value for the same factor in a two-way
ANOVA. This gave me pause in applying the same formula to multi-way
ANOVAs. A typical paper reports, for each variable and interactions in
a two- or three-way ANOVA or linear mixed effects model, the degrees
of freedom, F value, and P value. Here is an example of two-way ANOVA
results from a paper that examined effects of food and infection on
NH4 release by Daphnia (aquatic organisms):

Food C:P ratio
F[2,42] = 0.044, P = 0.96

Infection
F[1,42] = 1.92, P = 0.17

Food X Infection
F[2,42] = 2.59, P = 0.088

(Source: Narr, C. F. and P. C. Frost (2015). "Does infection tilt the
scales? Disease effects on the mass balance of an invertebrate
nutrient recycler." Oecologia 179(4): 969-979.)

For this meta-analysis we are interested in the effect of infection,
not in the other factor (food) or their interaction. In the
meta-analysis, we are analyzing absolute values of d, so we do not
need to know the direction of the effect. Some papers we find also
report MS and SS, and/or test statistics for the error term.

We have been using equations reported in this paper
(https://www.bmj.com/content/343/bmj.d2090) to estimate variance in d
based on the P value.

I'd really appreciate feedback (and references, if possible) on
whether the formula at the top can be applied to multi-way ANOVAs or
mixed effects models. If not, guidance on correct equations for
estimating d would be much appreciated! It would be fantastic to be
able to use data from many more papers.

Thank you for reading and for any advice!
Best,
Ilya