[R-meta] Effect size and variance from synthetic control studies

Fri Jan 29 10:19:51 CET 2021

Dear Dario,

I am going to set aside all the additional intricacies you outlined further below and focus on the following question: How can one compute a standardized mean difference when n=1 in one of the two groups?

The usual equation for d is:

d = (m1 - m2) / sdp

where sdp is the pooled SD of the two groups. This assumes that the true SDs are the same in the two groups. For the n=1 group, the observed SD is 0 by definition (strictly speaking, it is not even computable if we divide by n-1 in the equation for the SD), and that is not a sensible estimate of the true SD. Hence, pooling the two SDs makes no sense. Instead, one could standardize by the SD of the larger group alone:

d = (m1 - m2) / sd2

where m1 is the 'mean' of the n=1 group (i.e., the one observed value), m2 is the mean of the group with multiple observations (n2 of them), and sd2 is the SD from that group.

The bias correction is then:

g = (1-3/(4*(n2-1)-1)) * d.

The variance of g can then be estimated with:

Var(g) = 1 + 1/n2 + g^2/(2*n2)
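
To make the three equations above concrete, here is a minimal Python sketch. The function name smd_single_case and the example values are hypothetical and only for illustration; the check for at least three control observations is my own guard (with n2 = 2 the correction factor is exactly 0), not part of the equations. It treats all control observations as equally weighted, so it does not address weighted synthetic controls.

```python
from statistics import mean, stdev


def smd_single_case(x1, controls):
    """Return (d, g, var_g) for one observed value x1 versus a group of controls.

    Hypothetical helper: standardizes by the SD of the control group only,
    since the n=1 group provides no usable SD estimate.
    """
    n2 = len(controls)
    if n2 < 3:
        # Assumption of this sketch: with n2 = 2 the correction factor is 0.
        raise ValueError("need at least 3 control observations")
    m2 = mean(controls)
    sd2 = stdev(controls)  # sample SD with n2 - 1 in the denominator
    d = (x1 - m2) / sd2  # d = (m1 - m2) / sd2
    j = 1 - 3 / (4 * (n2 - 1) - 1)  # bias correction factor
    g = j * d
    var_g = 1 + 1 / n2 + g**2 / (2 * n2)  # the 1 is 1/n1 with n1 = 1
    return d, g, var_g


# Made-up example: one treated value and five control values.
d, g, var_g = smd_single_case(12.0, [8.0, 9.0, 10.0, 11.0, 12.0])
print(round(d, 4), round(g, 4), round(var_g, 4))
```

For these made-up values, d is about 1.26, g is about 1.01, and Var(g) is about 1.30. Note that the variance can never drop below 1, because the single observation contributes 1/n1 = 1.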

Best,
Wolfgang

>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>On Behalf Of Dario Schulz
>Sent: Friday, 22 January, 2021 17:02
>To: r-sig-meta-analysis using r-project.org
>Subject: [R-meta] Effect size and variance from synthetic control studies
>
>Hello there,
>
>I have two questions, the context is explained below.
>
>First, I'd like to know whether Borenstein's 2009 formula for the variance
>of standardized mean differences is appropriate if there is only a single
>treatment observation. Related to that, is Hedges' small-sample correction
>also applicable in such circumstances?
>
>And second, how should the pooled standard deviation be calculated if the
>controls have unequal weights?
>
>A number of primary studies in my meta-analysis used the Synthetic Control
>method (see for example https://doi.org/10.1073/pnas.2004334117). This
>method is used when only a few (e.g. just one) treatment units but many
>potential controls are available. The
>basic idea is to compare the observed treatment outcome with a synthetic
>control, which is a weighted combination of several units from a "donor
>pool". Should I therefore calculate the SD in the control group using a
>weighting method such as Hmisc::wtd.var() in R?
>
>In my context, there are typically multiple reported differences, i.e. one
>per year, so an approach that I thought of would be to calculate the
>treatment SD based on all observations (one for each year). A 2020 working
>paper by Hollingsworth and Coady (doi.org/10.31235/osf.io/fc9xt) calculates
>a type of Cohen's d by using the SD in pre-treatment periods. But if there is
>a temporal trend in both treated and control units that has nothing to do
>with the treatment, this would, if I understand it correctly, inflate the
>pooled SD and deflate the effect estimate. I therefore consider an approach
>that uses only the observations from one given year more useful. These can
>either be aggregated to an overall mean, or analyzed individually as
>dependent effect sizes.
>
>Looking forward to hearing your feedback and ideas!
>
>Kind regards
>Dario Schulz
>
>--
>
>Doctoral Student
>PhenoRob (Cluster of Excellence)
>Institute for Food and Resource Economics (ILR)
>University of Bonn
>Nußallee 21, D-53115 Bonn
>
>Email: dario.schulz using ilr.uni-bonn.de