# [R-meta] calculate effect size and variance for prepost proportion data

Viechtbauer, Wolfgang (NP)
Fri Feb 17 22:13:03 CET 2023

```
I liked the idea of specifying the 'joint proportion' so much that I implemented this as an alternative. So, in the 'devel' version of metafor, you can now do:

escalc(measure="MPORM", ai=30+15, bi=5+20, ci=30+5, di=15+20, pi=B, digits=8)

to get the same result. Of course, in practice, when only the marginal counts are known, one has to guesstimate that proportion (or correlation).
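
# A minimal sketch (base R only, not part of metafor) of how the joint
# proportion and the phi coefficient relate, so that a guess for one can be
# turned into a guess for the other. All names ending in '.demo' are
# hypothetical and only used for this illustration.
n.demo  <- 30 + 15 + 5 + 20
p1.demo <- (30 + 15) / n.demo   # first marginal proportion
p2.demo <- (30 +  5) / n.demo   # second marginal proportion
pi.demo <- 30 / n.demo          # joint proportion (both outcomes positive)

# phi from the joint proportion: covariance divided by the product of the SDs
sdprod.demo <- sqrt(p1.demo * (1 - p1.demo) * p2.demo * (1 - p2.demo))
ri.demo     <- (pi.demo - p1.demo * p2.demo) / sdprod.demo

# and back again from phi to the joint proportion
pi.back.demo <- p1.demo * p2.demo + ri.demo * sdprod.demo

# both should be TRUE: ri.demo matches the count-based phi formula
ri.counts.demo <- (30*20 - 15*5) / sqrt((30+15) * (5+20) * (30+5) * (15+20))
all.equal(ri.demo, ri.counts.demo)
all.equal(pi.back.demo, pi.demo)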

It is also worth noting that the marginal counts put constraints on the possible values for ri and pi. If a specified value for ri or pi is not feasible under a given table, the corresponding sampling variance will be NA.
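
# To illustrate the constraint with base R only (a sketch; this is not
# metafor's internal check): given two marginal proportions p1 and p2, the
# joint proportion must lie between max(0, p1 + p2 - 1) and min(p1, p2).
# The function name is hypothetical.
joint.range.demo <- function(p1, p2) {
   lo <- max(0, p1 + p2 - 1)   # smallest possible overlap
   hi <- min(p1, p2)           # largest possible overlap
   sdprod <- sqrt(p1 * (1 - p1) * p2 * (1 - p2))
   # the same bounds expressed on the phi coefficient scale
   c(pi.min = lo, pi.max = hi,
     ri.min = (lo - p1 * p2) / sdprod,
     ri.max = (hi - p1 * p2) / sdprod)
}

# bounds for the margins of the example table (45/70 and 35/70)
joint.range.demo(p1 = (30 + 15) / 70, p2 = (30 + 5) / 70)

# a guessed value outside [pi.min, pi.max] (or [ri.min, ri.max]) cannot
# arise from any 2x2 table with these margins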

Best,
Wolfgang

>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
>Behalf Of Viechtbauer, Wolfgang (NP) via R-sig-meta-analysis
>Sent: Friday, 17 February, 2023 20:17
>To: r-sig-meta-analysis using r-project.org
>Cc: Viechtbauer, Wolfgang (NP)
>Subject: Re: [R-meta] calculate effect size and variance for prepost proportion
>data
>
>Just a late follow-up to this:
>
>This is actually implemented in escalc() in the metafor package. See:
>
>https://wviechtb.github.io/metafor/reference/escalc.html#-b-measures-for-dichotomous-variables-2
>
>References are also given there.
>
>To illustrate:
>
>library(metafor)
>
>escalc(measure="MPOR", ai=30, bi=15,
>                       ci= 5, di=20, digits=8)
>
># James' formulas give the same results
>N <- 30 + 15 + 5 + 20
>P7 <- (30 + 15) / N
>P6 <- (30 +  5) / N
>B <- 30 / N
>log(P7 / (1 - P7)) - log(P6 / (1 - P6))
>(P6 * (1 - P6) + P7 * (1 - P7) - 2 * (B - P6 * P7)) / (N * P6 * (1 - P6) * P7 *
>(1 - P7))
>
># alternatively, one can specify the pre-post correlation (phi coefficient)
>ri <- (30*20 - 15*5) / sqrt((30+15) * (5+20) * (30+5) * (15+20))
>escalc(measure="MPORM", ai=30+15, bi=5+20, ci=30+5, di=15+20, ri=ri, digits=8)
>
># this is useful if one just has the 'marginal' counts and one needs to
># guesstimate the correlation
>
># show that the variance of the regular OR is the same as assuming ri=0
>escalc(measure="OR", ai=30+15, bi=5+20, ci=30+5, di=15+20, digits=8)
>escalc(measure="MPORM", ai=30+15, bi=5+20, ci=30+5, di=15+20, ri=0, digits=8)
>
>Best,
>Wolfgang
>
>>-----Original Message-----
>>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
>>Behalf Of James Pustejovsky
>>Sent: Monday, 30 January, 2023 16:50
>>To: Liu Sicong
>>Cc: r-sig-meta-analysis using r-project.org
>>Subject: Re: [R-meta] calculate effect size and variance for prepost proportion
>>data
>>
>>Hi Sicong,
>>
>>Responses below.
>>
>>James
>>
>>On Sun, Jan 29, 2023 at 6:30 AM Liu Sicong <64zone using gmail.com> wrote:
>>
>>> Hi James,
>>>
>>> I would like to ask two follow-up questions regarding the V(LOR) formula
>>> you kindly suggested previously (the relevant part has been attached below
>>> inside ###--- for your convenience)
>>>
>>>    - Q1: what would you suggest doing when N differs between 6th
>>>    (pre) and 7th (post) grade? Perhaps use the average of Npre and Npost?
>>>
>>This is tricky. The statistically correct answer depends on the number of
>>events among the participants who have data at both time-points, which is
>>probably not reported very often in practice. As an ad hoc approach, my
>>first thought would be to use a) the minimum of N-pre and N-post or b) the
>>harmonic mean of N-pre and N-post, N-harmonic = 2 / (1 / N-pre + 1 /
>>N-post).
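
# A small sketch of the two ad hoc choices described above; the sample
# sizes are hypothetical and chosen only for illustration.
n.pre  <- 120
n.post <- 100
n.min  <- min(n.pre, n.post)             # option a): the smaller of the two
n.harm <- 2 / (1 / n.pre + 1 / n.post)   # option b): the harmonic mean
c(n.min = n.min, n.harm = n.harm)
# the harmonic mean always lies between the minimum and the arithmetic mean,
# so option a) gives the larger (more cautious) sampling variance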
>>
>>>    - Q2: if we assume outcomes are independent (i.e., B = P6*P7), would
>>>    the situation become similar to computing V(LOR) for between-condition
>>>    effect sizes of proportion outcomes? In such a case, would the V(LOR)
>>>    formula you suggested be mathematically related to the one for
>>>    between-condition effect sizes (i.e., V(LOR) = 1/A + 1/B + 1/C + 1/D, where
>>>    A, B, C, and D represent the numbers of participants in the cells of the 2*2 table)?
>>
>>Yes, if you assume independence, then the covariance term will drop out and
>>you'll be left with
>>
>>V(LOR) = [P6 (1 - P6) + P7 (1 - P7)] / [N P6 (1 - P6) P7 (1 - P7)]
>>= 1 / [N P6 (1 - P6)] + 1 / [N P7 (1 - P7)]
>>= 1 / (N P6) + 1 / (N (1 - P6)) + 1 / (N P7) + 1 / (N (1 - P7))
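
# The chain of equalities above can be checked numerically. This is only a
# sketch; N, P6, and P7 are set to illustrative values here.
N  <- 70
P6 <- 35 / N
P7 <- 45 / N
B  <- P6 * P7   # independence assumption, so the covariance term is zero

# pre-post formula with the covariance term included
v.prepost <- (P6 * (1 - P6) + P7 * (1 - P7) - 2 * (B - P6 * P7)) /
             (N * P6 * (1 - P6) * P7 * (1 - P7))

# usual sum of reciprocal cell counts for two independent groups
v.between <- 1 / (N * P6) + 1 / (N * (1 - P6)) +
             1 / (N * P7) + 1 / (N * (1 - P7))

all.equal(v.prepost, v.between)   # TRUE under independence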
>>
>>> Say that the outcome is school suspension (at any time) during 6th
>>> grade (pre) and 7th grade (post). Let P6 be the overall proportion of
>>> students suspended during 6th grade, P7 be the overall proportion of
>>> students suspended during 7th grade, and B be the proportion of students
>>> suspended during both 6th and 7th grades. Let N be the total sample size
>>> (which I'm assuming to be the same at both time points). The pre-post LOR is
>>>
>>> LOR = log[P7 / (1 - P7)] - log[P6 / (1 - P6)]
>>>
>>> And an estimate of its sampling variance is
>>> V(LOR) = [P6 (1 - P6) + P7 (1 - P7) - 2 (B - P6 * P7)] / [N P6 (1 - P6) P7
>>> (1 - P7)]
>>>
>>> As you can see, you'll need to know B to compute this. If this is not
>>> reported, you could use a conservative estimate (i.e., a probable
>>> over-estimate) of the sampling variance based on the assumption that the
>>> outcomes are independent (in which case B = P6 * P7 and the last term in
>>> the numerator drops out) but I'm not sure how useful that would be in your
>>> application.
>>>
>>> ### ---
>>>
>>> ------------------------------------------
>>> Sicong (Zone) Liu, Ph.D.
>>> Research Associate
>>> University of Pennsylvania
>>>
>>> 3620 Walnut Street,
>>> ------------------------------------------
>>>
>>> *From: *James Pustejovsky <jepusto using gmail.com>
>>> *Date: *Wednesday, January 4, 2023 at 11:28 PM
>>> *To: *Sicong Liu <64zone using gmail.com>
>>> *Cc: *"r-sig-meta-analysis using r-project.org" <
>>> r-sig-meta-analysis using r-project.org>
>>> *Subject: *Re: [R-meta] calculate effect size and variance for prepost
>>> proportion data
>>>
>>> Yes, if you transform from LOR to d by taking sqrt(3) / pi * LOR, then you
>>> would multiply V(LOR) by 3 / pi^2.
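
# A sketch of the log odds ratio to d conversion in base R. It rests on the
# standard logistic distribution having variance pi^2 / 3, so the estimate
# is rescaled by sqrt(3) / pi and its variance by the square of that
# constant, 3 / pi^2. The input values are hypothetical.
lor   <- 0.59   # hypothetical pre-post log odds ratio
v.lor <- 0.07   # hypothetical sampling variance of the log odds ratio

d   <- lor * sqrt(3) / pi    # effect size on the d metric
v.d <- v.lor * 3 / pi^2      # sampling variance on the d metric
c(d = d, v.d = v.d)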
>>>
>>> On Wed, Jan 4, 2023 at 7:34 PM Liu Sicong <64zone using gmail.com> wrote:
>>>
>>> Thank you for clarifying James!
>>>
>>> Just one follow-up question:
>>>
>>>    - If I would like to transform the V(LOR) to Cohen’s d metric, does
>>>    “V(LOR) * 3/Pi^2” still work? Thank you!
>>>
>>> Cheers,
>>>
>>> Zone
>>>
>>> -------------
>>> *From: *James Pustejovsky <jepusto using gmail.com>
>>> *Date: *Tuesday, January 3, 2023 at 3:57 PM
>>> *To: *Sicong Liu <64zone using gmail.com>
>>> *Cc: *"r-sig-meta-analysis using r-project.org" <
>>> r-sig-meta-analysis using r-project.org>
>>> *Subject: *Re: [R-meta] calculate effect size and variance for prepost
>>> proportion data
>>>
>>> Hi Zone,
>>>
>>> I have not been able to find a reference for the pre-post log odds ratio
>>> in particular. I derived the formula using the delta method (same as Wei
>>> and Higgins) and the properties of the multinomial distribution.
>>>
>>> Perhaps others on the list know of a reference?
>>>
>>> James
>>>
>>> On Tue, Jan 3, 2023 at 2:13 PM Liu Sicong <64zone using gmail.com> wrote:
>>>
>>> Happy 2023 and thank you for your response, James!
>>>
>>> I wonder if you could point me to the reference of the formulas raised,
>>> especially the V(LOR) one? I checked Wei and Higgins (2013) but did not
>>> find such a formula explicitly expressed in the paper. Perhaps the V(LOR)
>>> is derived from their general method? Please let me know.
>>>
>>> Cheers,
>>> Zone
>>>
>>> -------------
>>>
>>> *From: *James Pustejovsky <jepusto using gmail.com>
>>> *Date: *Tuesday, January 3, 2023 at 10:43 AM
>>> *To: *Sicong Liu <64zone using gmail.com>
>>> *Cc: *"r-sig-meta-analysis using r-project.org" <
>>> r-sig-meta-analysis using r-project.org>
>>> *Subject: *Re: [R-meta] calculate effect size and variance for prepost
>>> proportion data
>>>
>>> Hi Zone,
>>>
>>> I think it is less common to use pre-post effect size measures with binary
>>> outcomes. In principle, it can be done, but my sense is that there is less
>>> benefit (in terms of precision improvement) from using a binary pre-test
>>> than there is from accounting for pre-tests with continuous outcomes.
>>>
>>> Wei and Higgins (2013; https://doi.org/10.1002/sim.5679) discuss the
>>> covariance between log odds ratios computed for different binary outcomes,
>>> which is closely related to the case you're looking at. In order to get an
>>> accurate estimate of the sampling variance of the pre-post log odds ratio,
>>> you will need to know the correlation between the pre-test outcome and the
>>> post-test outcome or, equivalently, the number of participants with the
>>> positive outcome at both pre-test and post-test.
>>>
>>> Say that the outcome is school suspension (at any time) during 6th
>>> grade (pre) and 7th grade (post). Let P6 be the overall proportion of
>>> students suspended during 6th grade, P7 be the overall proportion of
>>> students suspended during 7th grade, and B be the proportion of students
>>> suspended during both 6th and 7th grades. Let N be the total sample size
>>> (which I'm assuming to be the same at both time points). The pre-post LOR is
>>>
>>> LOR = log[P7 / (1 - P7)] - log[P6 / (1 - P6)]
>>>
>>> And an estimate of its sampling variance is
>>>
>>> V(LOR) = [P6 (1 - P6) + P7 (1 - P7) - 2 (B - P6 * P7)] / [N P6 (1 - P6) P7
>>> (1 - P7)]
>>>
>>> As you can see, you'll need to know B to compute this. If this is not
>>> reported, you could use a conservative estimate (i.e., a probable
>>> over-estimate) of the sampling variance based on the assumption that the
>>> outcomes are independent (in which case B = P6 * P7 and the last term in
>>> the numerator drops out) but I'm not sure how useful that would be in your
>>> application.
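
# The two formulas above could be wrapped up along these lines. This is a
# sketch only: the function name is hypothetical, and B defaults to the
# independence value P6 * P7, which gives the conservative variance.
prepost.lor.demo <- function(P6, P7, N, B = P6 * P7) {
   # difference between the post and pre log odds
   lor <- log(P7 / (1 - P7)) - log(P6 / (1 - P6))
   # sampling variance, including the covariance term 2 * (B - P6 * P7)
   v <- (P6 * (1 - P6) + P7 * (1 - P7) - 2 * (B - P6 * P7)) /
        (N * P6 * (1 - P6) * P7 * (1 - P7))
   c(yi = lor, vi = v)
}

# illustrative values: 70 students, 35 suspended in 6th grade, 45 in 7th
# grade, 30 in both grades
prepost.lor.demo(P6 = 35/70, P7 = 45/70, N = 70, B = 30/70)   # B known
prepost.lor.demo(P6 = 35/70, P7 = 45/70, N = 70)              # B not reported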
>>>
>>> James
>>>
>>> On Mon, Jan 2, 2023 at 7:40 AM Liu Sicong <64zone using gmail.com> wrote:
>>>
>>> Happy 2023 All!
>>>
>>> I have some prepost proportion data. For instance, some clinical trials
>>> may intervene on patients’ vaccine uptake and report the proportion of
>>> patients who received the vaccine both prior to and after interventions. So
>>> I may have the following data
>>>
>>>   *   Outcomes in proportion: p_control_pre, p_control_post,
>>> p_experiment_pre, p_experiment_post
>>>   *   Sample sizes: n_control_pre, n_control_post, n_experiment_pre,
>>> n_experiment_post
>>>
>>> I am clear about how to calculate between-condition effect sizes and
>>> variances in the following manner. For instance, those for comparing the
>>> conditions at posttest would be:
>>>
>>>   *   Effect size: ln((p_experiment_post/(1 -
>>> p_experiment_post))/(p_control_post/(1 - p_control_post)))
>>>   *   Variance of effect size: 1/(n_experiment_post*p_experiment_post) +
>>> 1/(n_experiment_post*(1-p_experiment_post)) +
>>> 1/(n_control_post*p_control_post) + 1/(n_control_post*(1-p_control_post))
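
# The between-condition computation described above can be sketched in
# base R as follows. The proportions and sample sizes are hypothetical and
# the short variable names are used only for this illustration.
p.e <- 0.60; n.e <- 100   # experimental group at post-test
p.c <- 0.45; n.c <- 110   # control group at post-test

# log odds ratio comparing the experimental to the control group
yi <- log((p.e / (1 - p.e)) / (p.c / (1 - p.c)))

# sampling variance: sum of the reciprocals of the four implied cell counts
vi <- 1 / (n.e * p.e) + 1 / (n.e * (1 - p.e)) +
      1 / (n.c * p.c) + 1 / (n.c * (1 - p.c))
c(yi = yi, vi = vi)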
>>>
>>> My question is about how to calculate the effect size and its variance
>>> when I am also interested in within-condition growth. For instance, how to
>>> represent the prepost growth due to vaccination intervention for the
>>> experimental group? Perhaps even before asking this question, would it be
>>> reasonable to attempt the computation of such effect sizes and variances?
>>> Thank you very much!
>>>
>>> Best regards,
>>> Sicong (Zone)
>>>
>>> ------------------------------------------
>>> Sicong (Zone) Liu, Ph.D.
>>> Research Associate
>>> University of Pennsylvania
>>>
>>> 3620 Walnut Street,