[R-meta] Effect size calculation: Hedges g average / dependent groups

Thu Jan 11 08:54:34 CET 2024

Dear Pia,

Please see my responses below.

Best,
Wolfgang

> -----Original Message-----
> From: R-sig-meta-analysis <r-sig-meta-analysis-bounces using r-project.org> On Behalf
> Of Pia-Magdalena Schmidt via R-sig-meta-analysis
> Sent: Thursday, December 21, 2023 22:32
> To: r-sig-meta-analysis using r-project.org
> Cc: Pia-Magdalena Schmidt <pia-magdalena.schmidt using uni-bonn.de>
> Subject: [R-meta] Effect size calculation: Hedges g average / dependent groups
>
> Dear all,
>
> I would be very grateful if you could help me clarify my calculations.
>
> I am conducting a meta-analysis using metafor and want to calculate Hedges g
> av (see Lakens, 2013 - doi: 10.3389/fpsyg.2013.00863).
>
> [background: within-design with healthy participants investigating drug
> effects compared to placebo, repeated measures: 1x drug, 1x placebo (no pre-
> , post- treatment design); available data: m1i, m2i, sd1i, sd2i, n1i, n2i
> with n1i = n2i]
>
> 1. As I did not find a direct way to calculate Hedges g av in metafor, I
> calculated the sd (sd_average = (sd1i + sd2i)/2) manually and used “SMD1”,
> since this only requires one sd input and should otherwise be the same as
> “SMD”. Is that right?

In principle, yes, but with two caveats: the bias correction may not be quite right in this case (a minor issue unless sample sizes are very small), and the sampling variance will not be computed correctly. See:

Cousineau, D. (2020). Approximating the distribution of Cohen's d_p in within-subject designs. The Quantitative Methods for Psychology, 16(4), 418-421. https://doi.org/10.20982/tqmp.16.4.p418

Cousineau, D., & Goulet-Pelletier, J.-C. (2021). A study of confidence intervals for Cohen's dp in within-subject designs with new proposals. The Quantitative Methods for Psychology, 17(1), 51-75. https://doi.org/10.20982/tqmp.17.1.p051

and

https://cran.r-project.org/package=CohensdpLibrary

for an R package that does the corresponding calculations.

> 2. To check my approach, I compared “SMD1” with sd2i = sd_pooled and
> compared the results to “SMD”. I expected identical effect sizes as I
> thought that “SMD1” with sd2i = sd_pooled should match “SMD”, but my
> calculations yielded different values for yi & vi.

There are two issues here. First of all, the average SD should be computed by averaging the variances and then taking the square root:

sd_average <- sqrt(1/2*(dat$`sd_drug`^2 + dat$`sd_plc`^2))

This is also what Cumming (2012) describes (eq. 11.9), so the formula in Lakens (2013), which averages the two SDs directly, is incorrect.
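
As a quick illustration of how much the two ways of averaging differ, here is a small base R sketch using the SDs from the example data in the original post (the variable names sd_avg_of_sds and sd_avg_of_vars are just made up for this comparison):

```r
# SDs taken from the example data in the original post
sd_drug <- c(60.2, 36.6, 58.09)
sd_plc  <- c(89.9, 40.6, 44.93)

# averaging the SDs directly (as in the original code)
sd_avg_of_sds <- (sd_drug + sd_plc) / 2

# averaging the variances and then taking the square root
sd_avg_of_vars <- sqrt((sd_drug^2 + sd_plc^2) / 2)

# show both versions side by side together with their difference
round(cbind(sd_avg_of_sds, sd_avg_of_vars,
            diff = sd_avg_of_vars - sd_avg_of_sds), 3)
```

The second version can never be smaller than the first, and in these data the two are closest in the study where the drug and placebo SDs are most similar.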

Also, the bias correction is different in these two cases. But you can switch off the correction with correct=FALSE. So:

# SMD1 with the corrected sd_average, bias correction switched off
es_smd1average <- escalc(measure="SMD1",
   n1i = dat$`n_drug`,
   n2i = dat$`n_plc`,
   m1i = dat$`m_drug`,
   m2i = dat$`m_plc`,
   sd2i = sd_average,
   vtype = "UB", correct = FALSE)

# SMD with both SDs, bias correction switched off
es_smd <- escalc(measure="SMD",
   n1i = dat$`n_drug`,
   n2i = dat$`n_plc`,
   m1i = dat$`m_drug`,
   m2i = dat$`m_plc`,
   sd1i = dat$`sd_drug`,
   sd2i = dat$`sd_plc`,
   vtype = "UB", correct = FALSE)

Then the point estimates will be the same (with n1i = n2i, the pooled SD used by "SMD" equals the square root of the average variance), but the sampling variances are still computed differently. In any case, if you want to use d_av, check out the references and R package provided above.
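
A quick check in base R, using the example data from the original post, may help to see where the difference in the bias-corrected values comes from. This is only a sketch: the helper cmi() is written just for this illustration, and I am assuming here that the correction factor is based on n1i + n2i - 2 degrees of freedom for "SMD" and on n2i - 1 degrees of freedom for "SMD1".

```r
# example data from the original post (n_drug = n_plc in every study)
n       <- c(10, 20, 15)
m_drug  <- c(435.4, 460, 404)
m_plc   <- c(493.7, 460, 474)
sd_drug <- c(60.2, 36.6, 58.09)
sd_plc  <- c(89.9, 40.6, 44.93)

# with equal group sizes, the pooled SD equals the root of the average variance
sd_average <- sqrt((sd_drug^2 + sd_plc^2) / 2)
sd_pooled  <- sqrt(((n - 1) * sd_drug^2 + (n - 1) * sd_plc^2) / (2 * n - 2))
all.equal(sd_average, sd_pooled)

# uncorrected standardized mean difference
d <- (m_drug - m_plc) / sd_pooled

# illustrative helper: exact small-sample correction factor for df degrees of freedom
cmi <- function(df) exp(lgamma(df / 2) - log(sqrt(df / 2)) - lgamma((df - 1) / 2))

# bias-corrected values under the two assumed df choices
g_smd  <- cmi(2 * n - 2) * d   # df = n1i + n2i - 2
g_smd1 <- cmi(n - 1) * d       # df = n2i - 1
round(cbind(d, g_smd, g_smd1), 4)
```

If the assumption about the degrees of freedom is right, the g_smd and g_smd1 columns should match the yi values from print(es_smd) and print(es_smd1pooled) in the original post, which would explain why the two sets of estimates differ even though the same SD goes into the denominator.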

> An example of my code and results:
>
> library(metafor)
>
> # create minimal data
> dat <- structure(list(id = c(1, 2, 3), n_drug = c(10, 20, 15), m_drug =
> c(435.4, 460, 404), sd_drug = c(60.2, 36.6, 58.09), n_plc = c(10, 20, 15),
> m_plc = c(493.7, 460, 474), sd_plc = c(89.9, 40.6, 44.93)), class =
> c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L))
>
> sd_average <- 1/2*(dat$`sd_drug` + dat$`sd_plc`)
> sd_pooled <- sqrt(((dat$`n_drug`-1)*dat$`sd_drug`^2 +
> (dat$`n_plc`-1)*dat$`sd_plc`^2) / (dat$`n_drug`+dat$`n_plc`-2))
>
> # --- Relating Question 1 ---
> # SMD1 with input sd_average
> es_smd1average = escalc(measure="SMD1",
>    n1i = dat$`n_drug`,
>    n2i = dat$`n_plc`,
>    m1i = dat$`m_drug`,
>    m2i = dat$`m_plc`,
>    sd2i= sd_average,
>    vtype ="UB")
>
> # --- Relating Question 2 ---
> # SMD1 with input sd_pooled
> es_smd1pooled= escalc(measure="SMD1",
>    n1i=dat$`n_drug`,
>    n2i =dat$`n_plc`,
>    m1i = dat$`m_drug`,
>    m2i = dat$`m_plc`,
>    sd2i= sd_pooled,
>    vtype ="UB")
>
> # SMD
> es_smd= escalc(measure="SMD",
>    n1i=dat$`n_drug`,
>    n2i =dat$`n_plc`,
>    m1i = dat$`m_drug`,
>    m2i = dat$`m_plc`,
>    sd1i = dat$`sd_drug`,
>    sd2i = dat$`sd_plc`,
>    vtype ="UB")
>
> > print(es_smd1pooled)
>
>    yi vi
> 1 -0.6964 0.2333
> 2 0.0000 0.1000
> 3 -1.2743 0.1995
>
> > print(es_smd)
>
>    yi vi
> 1 -0.7298 0.2164
> 2 0.0000 0.1000
> 3 -1.3115 0.1661
>
> I thought about using “SMCC” instead but ri, t-statistics and p-values are
> (often) unknown.
>
> Do you have any comments?
>
> Many thanks in advance!
> Best,
> Pia