[R-meta] Correcting Hedges' g vs. Log response ratio in nested studies

Yuhang Hu yh342 using nau.edu
Fri Nov 3 01:32:09 CET 2023


Thanks so much, James. This is how I got it set up. Do I have it right?

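library(dplyr)

dat <- read.table(header=TRUE, text="
study  gi   vi  cluster n1  n2
1     .2   .05  T       25  23
1     .3   .08  T       18  11
2     1    .1   F       19  21
2     2    .2   F       12  36")

# Correction factor applied to g for clustered rows (WWC Eq. E.5.1)
g_cluster <- \(gi, n1, n2, icc=.15){
  n <- mean(c(n1,n2), na.rm=TRUE)  # n taken as the mean of the group sizes
  N <- sum(c(n1,n2), na.rm=TRUE)   # N taken as the total of the group sizes
  gi*sqrt( 1-((2*(n-1)*icc)/(N-2)) )
}

# Note: inside group_by(study), n1 and n2 are vectors holding all rows of a study
group_by(dat, study) %>%
  mutate(gi = ifelse(cluster, g_cluster(gi,n1,n2), gi))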
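
For comparison, here is a minimal row-wise sketch that follows James's point (quoted below) that n, N, n1, and n2 belong to each effect size rather than to the study. It assumes a hypothetical column `n_avg` holding the average cluster size for each row; the values are invented for illustration, since the example data do not record cluster sizes. The ICC of 0.15 is the same placeholder as above.

```r
# Row-wise sketch of the WWC cluster correction for g.
# ASSUMPTION: `n_avg` (average cluster size per row) is a hypothetical column
# with made-up values; it is not part of the original data.
dat2 <- read.table(header = TRUE, text = "
study  gi   vi  cluster n1  n2  n_avg
1     .2   .05  T       25  23  6
1     .3   .08  T       18  11  5
2     1    .1   F       19  21  NA
2     2    .2   F       12  36  NA")

g_cluster_row <- function(gi, n_avg, n1, n2, icc = 0.15) {
  N <- n1 + n2                      # total sample size for this row
  gi * sqrt(1 - (2 * (n_avg - 1) * icc) / (N - 2))
}

# Vectorized over rows, so no grouping is needed; unclustered rows keep gi.
dat2$gi_adj <- ifelse(dat2$cluster,
                      g_cluster_row(dat2$gi, dat2$n_avg, dat2$n1, dat2$n2),
                      dat2$gi)
dat2
```

Because the function is vectorized, each row uses its own n and N, and rows with `cluster` equal to FALSE pass through unchanged.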

On Thu, Nov 2, 2023 at 3:22 PM James Pustejovsky <jepusto using gmail.com> wrote:

> This correction applies to a single effect size estimate from a given
> study. All of these values (n, N, n1, n2) are therefore specific to the
> study and need to be recorded for each row in a meta-analysis database.
>
> On Nov 2, 2023, at 5:14 PM, Yuhang Hu <yh342 using nau.edu> wrote:
>
> 
> Sure, I thought N is n1 + n2, which is unique to each row in the dataset.
>
> But it looks like I should compute N as n * m where "n" is (average of
> n1, n2) for each row in the data but "m" is constant across all rows in the
> dataset.
>
> Thanks,
> Yuhang
>
> On Thu, Nov 2, 2023 at 2:16 PM James Pustejovsky <jepusto using gmail.com>
> wrote:
>
>> Total sample size is the same thing as the average sample size per
>> cluster times the number of clusters. My previous message is just a
>> restatement of the formula to show how it is related to the number of
>> clusters.
>>
>> On Thu, Nov 2, 2023 at 4:11 PM Yuhang Hu <yh342 using nau.edu> wrote:
>>
>>> Hi James,
>>>
>>> If you look at Eq. E.5.1 on p. 1 of this document (
>>> https://ies.ed.gov/ncee/wwc/Docs/referenceresources/WWC-41-Supplement-508_09212020.pdf),
>>> they define the correction factor as: sqrt( 1-((2*(n-1)*icc)/(N-2)) )
>>> where N is n1 + n2 (total sample size), and n is the average number
>>> of individuals per cluster.
>>>
>>> Am I missing something? Or is the correction factor linked above from
>>> WWC inaccurate?
>>>
>>> Thank you,
>>> Yuhang
>>>
>>> On Thu, Nov 2, 2023 at 1:51 PM James Pustejovsky <jepusto using gmail.com>
>>> wrote:
>>>
>>>> Responses inline below.
>>>>
>>>> On Thu, Nov 2, 2023 at 3:30 PM Yuhang Hu <yh342 using nau.edu> wrote:
>>>>
>>>>> Regarding your first message, it looks like the correction factor for
>>>>> SMD is: sqrt( 1-((2*(n-1)*icc)/(N-2)) ) where n is the average cluster size
>>>>> for each comparison in a study, and N is the sum of the two groups' sample
>>>>> sizes. So, I wonder how the number of clusters is impacting the correction
>>>>> factor for SMD as you indicated?
>>>>>
>>>> N = n * m, where m is the number of clusters. So the correction factor is
>>>> sqrt( 1-((2*(n-1)*icc)/(m * n - 2)) ) ~= sqrt( 1 - 2 * icc / m )
>>>>
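
The approximation in the inline reply above can be checked numerically. This is a minimal sketch; the function names and the grid of n, m, and ICC values are chosen only for illustration.

```r
# Exact WWC factor versus the approximation sqrt(1 - 2 * icc / m).
# n = average cluster size, m = number of clusters, so N = n * m.
cf_exact  <- function(n, m, icc) sqrt(1 - (2 * (n - 1) * icc) / (n * m - 2))
cf_approx <- function(m, icc) sqrt(1 - 2 * icc / m)

# Illustrative grid (values chosen for demonstration only)
grid <- expand.grid(n = c(5, 20, 100), m = c(4, 10, 40))
grid$exact  <- cf_exact(grid$n, grid$m, icc = 0.15)
grid$approx <- cf_approx(grid$m, icc = 0.15)
round(grid, 3)
```

On this grid the two versions differ by well under 0.01, and both move toward 1 as the number of clusters m grows.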
>>>>
>>>>> Regarding my initial question, my hunch was that for SMD, the SMD
>>>>> estimate and its sampling variance are (non-linearly) related to one
>>>>> another. Therefore, correcting the sampling variance for a design issue
>>>>> will necessitate correcting the SMD estimate as well.
>>>>>
>>>>> On the other hand, the LRR estimate and its sampling variance are not
>>>>> as much related to one another. Therefore, correcting the sampling variance
>>>>> for a design issue will not necessitate correcting the LRR estimate as well.
>>>>>
>>>>>
>>>> No, the issue you've described here is pretty much unrelated to the
>>>> bias correction problem.
>>>>
>>>>
>>>>> On Thu, Nov 2, 2023 at 8:41 AM James Pustejovsky <jepusto using gmail.com>
>>>>> wrote:
>>>>>
>>>>>> One other thought on this question, for the extra-nerdy.
>>>>>>
>>>>>> The formulas for the Hedges' g SMD estimator involve what
>>>>>> statisticians would call "second-order" bias corrections, meaning
>>>>>> corrections arising from having a limited sample size. In contrast, the
>>>>>> usual estimator of the LRR is just a "plug-in" estimator that works for
>>>>>> large sample sizes but can have small biases with limited sample sizes.
>>>>>> Lajeunesse (2015; https://doi.org/10.1890/14-2402.1) provides
>>>>>> formulas for the second-order bias correction of the LRR estimator with
>>>>>> independent samples. These bias correction formulas actually *would* need
>>>>>> to be different if you have clustered observations. So, the two effect size
>>>>>> metrics are maybe more similar than it initially seemed:
>>>>>> - Both metrics have plug-in estimators that are not really affected
>>>>>> by the dependence structure of the sample, but whose variance estimators do
>>>>>> need to take into account the dependence structure
>>>>>> - Both metrics have second-order corrected estimators, the exact form
>>>>>> for which does need to take into account the dependence structure.
>>>>>>
>>>>>> James
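
To make the plug-in versus second-order distinction concrete, here is a small sketch for independent samples only. The correction term is the usual second-order Taylor (delta-method) adjustment, which is my understanding of the form Lajeunesse (2015) gives; please check it against the paper before relying on it. The summary statistics are invented, and the clustered version James mentions would need a different formula that is not shown here.

```r
# Plug-in LRR versus a second-order bias-corrected LRR (independent samples).
# ASSUMPTION: the summary statistics below are invented for illustration.
lrr_plugin <- function(m1, m2) log(m1 / m2)

lrr_corrected <- function(m1, sd1, n1, m2, sd2, n2) {
  # Adds half the difference of the squared CVs of the two sample means
  log(m1 / m2) + 0.5 * (sd1^2 / (n1 * m1^2) - sd2^2 / (n2 * m2^2))
}

m1 <- 12; sd1 <- 6; n1 <- 10
m2 <- 10; sd2 <- 3; n2 <- 10

lrr_plugin(m1, m2)
lrr_corrected(m1, sd1, n1, m2, sd2, n2)
```

With these made-up numbers the two estimates differ by 0.008, and the gap shrinks as n1 and n2 increase.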
>>>>>>
>>>>>> On Thu, Nov 2, 2023 at 8:14 AM James Pustejovsky <jepusto using gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> Wolfgang is correct. The WWC correction factor arises because the
>>>>>>> sample variance is not quite unbiased as an estimator for the total
>>>>>>> population variance in a design with clusters of dependent observations,
>>>>>>> which leads to a small bias in the SMD.
>>>>>>>
>>>>>>> The thing is, though, this correction factor is usually negligible.
>>>>>>> Say you’ve got a clustered design with n = 21 kids per cluster and 20
>>>>>>> clusters, and an ICC of 0.2. Then the correction factor is going to be
>>>>>>> about 0.99 and so will make very little difference for the effect size
>>>>>>> estimate. It only starts to matter if you’re looking at studies with very
>>>>>>> few clusters and non-trivial ICCs.
>>>>>>>
>>>>>>> James
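
James's numbers can be reproduced directly. The second calculation is a hypothetical contrast (not from the thread) that keeps n and the ICC but assumes only two clusters.

```r
# James's example: n = 21 per cluster, m = 20 clusters, ICC = 0.2
n <- 21; m <- 20; icc <- 0.2
N <- n * m
cf <- sqrt(1 - (2 * (n - 1) * icc) / (N - 2))
round(cf, 3)   # about 0.99

# Hypothetical contrast: same n and ICC, but only 2 clusters in total
cf_few <- sqrt(1 - (2 * (n - 1) * icc) / (n * 2 - 2))
round(cf_few, 3)
```

With two clusters the factor drops to sqrt(0.8), roughly 0.89, which illustrates the "very few clusters" case.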
>>>>>>>
>>>>>>> > On Nov 2, 2023, at 3:04 AM, Viechtbauer, Wolfgang (NP) via
>>>>>>> > R-sig-meta-analysis <r-sig-meta-analysis using r-project.org> wrote:
>>>>>>> > Dear Yuhang,
>>>>>>> >
>>>>>>> > I haven't looked deeply into this, but an immediate thought I have
>>>>>>> > is that for SMDs, you divide by some measure of variability within the
>>>>>>> > groups. If that measure of variability is affected by your study design,
>>>>>>> > then this will also affect the SMD value. On the other hand, this doesn't
>>>>>>> > have any impact on LRRs since they are only the (log-transformed) ratio of
>>>>>>> > the means.
>>>>>>> >
>>>>>>> > Best,
>>>>>>> > Wolfgang
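
Wolfgang's point can be seen in a toy calculation: hold the two group means fixed and change only the SD used as the standardizer. All numbers are made up, and `smd` here is the plain standardized difference without the small-sample correction.

```r
# Same group means, different within-group SDs (made-up numbers).
m1 <- 12; m2 <- 10
smd <- function(m1, m2, sd_pooled) (m1 - m2) / sd_pooled
lrr <- function(m1, m2) log(m1 / m2)

smd(m1, m2, sd_pooled = 4)   # 0.5
smd(m1, m2, sd_pooled = 5)   # 0.4
lrr(m1, m2)                  # log(1.2); the SD does not enter at all
```

Anything in the design that shifts the SD estimate moves the SMD, while the plug-in LRR stays the same.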
>>>>>>> >
>>>>>>> >> -----Original Message-----
>>>>>>> >> From: R-sig-meta-analysis <r-sig-meta-analysis-bounces using r-project.org>
>>>>>>> >> On Behalf Of Yuhang Hu via R-sig-meta-analysis
>>>>>>> >> Sent: Thursday, November 2, 2023 05:42
>>>>>>> >> To: R meta <r-sig-meta-analysis using r-project.org>
>>>>>>> >> Cc: Yuhang Hu <yh342 using nau.edu>
>>>>>>> >> Subject: [R-meta] Correcting Hedges' g vs. Log response ratio in
>>>>>>> >> nested studies
>>>>>>> >>
>>>>>>> >> Hello All,
>>>>>>> >>
>>>>>>> >> I know that when correcting Hedges' g (i.e., bias-corrected SMD, aka "g")
>>>>>>> >> in nested studies, we have to **BOTH** adjust our initial "g" and its
>>>>>>> >> sampling variance "vi_g"
>>>>>>> >> (https://ies.ed.gov/ncee/wwc/Docs/referenceresources/WWC-41-Supplement-508_09212020.pdf).
>>>>>>> >>
>>>>>>> >> But when correcting Log Response Ratios (LRR) in nested studies, we have to
>>>>>>> >> **ONLY** adjust its initial sampling variance "vi_LRR" but not "LRR" itself
>>>>>>> >> (https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2021-October/003486.html).
>>>>>>> >>
>>>>>>> >> I wonder why the two methods of correction differ for Hedges' g and LRR?
>>>>>>> >>
>>>>>>> >> Thanks,
>>>>>>> >> Yuhang
