[R-meta] Dependant variable in Meta Analysis

Thu Aug 27 15:23:47 CEST 2020

Hi Tarun,

For 1, exp(b1) is the ratio between the two groups, so (exp(b1) - 1)*100 is the percent change, not b1*100.
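
As a small numerical illustration (the value of b1 is made up for the example), reading exp(b1) as the ratio between the two groups, the conversion from a log-scale coefficient to a percent change could look like this in R:

```r
# Hypothetical coefficient from a model of the form ln(y) = b0 + b1*x + e
b1 <- 0.05

ratio      <- exp(b1)              # multiplicative change in y when x = 1
pct_change <- (exp(b1) - 1) * 100  # percent change in y when x = 1
naive      <- b1 * 100             # only an approximation, reasonable for small b1

round(c(ratio = ratio, pct_change = pct_change, naive = naive), 3)
```

For small b1 the two percentages are close, but they drift apart as b1 grows.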

For 2, if you know b0 and b1, then you know the mean of y for x=0 (b0) and the mean of y for x=1 (b0+b1). You also need the SD for x=0 and the SD for x=1, and these can't be recovered from the coefficients. However, if you know the MSE, then its square root is the pooled within-group SD, which you can use for both groups instead. You also need the number of observations with x=0 and with x=1 (the two group sizes, n0 and n1). Then you have everything to compute the ROM and its sampling variance.
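
A minimal sketch of this computation, using only base R and made-up input values. The function name rom_from_regression and the numbers are assumptions for illustration; the variance is the usual large-sample formula for the log ratio of means, with the pooled SD plugged in for both groups. Both group means must be positive for the log to be defined.

```r
# ROM and its sampling variance from y = b0 + b1*x + e with a 0/1 dummy x
# (hypothetical helper, not part of any package)
rom_from_regression <- function(b0, b1, mse, n0, n1) {
  m0  <- b0            # mean of y for x = 0
  m1  <- b0 + b1       # mean of y for x = 1
  stopifnot(m0 > 0, m1 > 0)
  sdp <- sqrt(mse)     # pooled within-group SD
  yi  <- log(m1 / m0)  # log ratio of means (x = 1 relative to x = 0)
  vi  <- sdp^2 / (n1 * m1^2) + sdp^2 / (n0 * m0^2)
  c(yi = yi, vi = vi)
}

# Made-up example values
res <- rom_from_regression(b0 = 100, b1 = 5, mse = 100, n0 = 50, n1 = 50)
res
```

Passing the same two means, sqrt(mse) as both sd1i and sd2i, and the group sizes to escalc(measure="ROM", ...) from metafor should give the same values.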

If you don't know the MSE but you do know the SE of b1, then you can back-calculate the MSE from it (assuming you know n0 and n1). If only t = b1/SE[b1] is reported, then SE[b1] = b1/t, and if only the p-value is reported, it can first be transformed into t. The back-calculation is:

MSE = SE[b1]^2 * sum((x_i - mean(x))^2)

The second term can be computed if you know n0 and n1, since:

sum((x_i - mean(x))^2) = n0 * (0 - n1/(n0+n1))^2 + n1 * (1 - n1/(n0+n1))^2.

One can simplify this equation further (it reduces to n0*n1/(n0+n1)), but this should make it clear that mean(x) is just the proportion of 1's and x_i can only take on two different values here (0 and 1).
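
The back-calculation can be checked on simulated data. This is a sketch in base R; the helper names sxx_dummy, mse_from_se, and se_from_p are made up for the example, and the p-value conversion assumes a two-sided t-test with n0+n1-2 degrees of freedom (i.e., no other predictors in the model).

```r
# Sum of squares of a 0/1 dummy around its mean, from the group sizes alone
sxx_dummy <- function(n0, n1) {
  p <- n1 / (n0 + n1)               # mean(x) = proportion of 1's
  n0 * (0 - p)^2 + n1 * (1 - p)^2   # simplifies to n0*n1/(n0+n1)
}

# MSE = SE[b1]^2 * sum((x_i - mean(x))^2)
mse_from_se <- function(se_b1, n0, n1) se_b1^2 * sxx_dummy(n0, n1)

# SE[b1] from a two-sided p-value: p -> |t| -> SE
se_from_p <- function(b1, p, n0, n1) {
  tval <- qt(p / 2, df = n0 + n1 - 2, lower.tail = FALSE)
  abs(b1) / tval
}

# Simulated data to compare against the values lm() reports
set.seed(1234)
x   <- c(rep(0, 50), rep(1, 50))
y   <- 100 + 5 * x + rnorm(100, 0, 10)
fit <- lm(y ~ x)
cf  <- summary(fit)$coefficients

se_b1      <- cf["x", "Std. Error"]
mse_direct <- sum(residuals(fit)^2) / df.residual(fit)
mse_back   <- mse_from_se(se_b1, n0 = 50, n1 = 50)
se_back    <- se_from_p(cf["x", "Estimate"], cf["x", "Pr(>|t|)"], 50, 50)

all.equal(mse_direct, mse_back)  # MSE recovered from SE[b1]
all.equal(se_b1, se_back)        # SE[b1] recovered from the p-value
```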

For 3, as discussed, you can use ROM.

For 4, you are out of luck. You need the means of the two groups (to compute ROM and its variance), but if you only know their difference, then this is not sufficient.
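
To see why the difference alone is not sufficient, here is a small sketch with made-up numbers: two hypothetical studies with the same mean difference but different group means give very different log ratios of means.

```r
# Two hypothetical studies with the same mean difference (5) but different means
m0_a <- 100; m1_a <- 105
m0_b <- 10;  m1_b <- 15

rom_a <- log(m1_a / m0_a)  # about a 5% increase
rom_b <- log(m1_b / m0_b)  # a 50% increase

c(diff_a = m1_a - m0_a, diff_b = m1_b - m0_b, rom_a = rom_a, rom_b = rom_b)
```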

Best,
Wolfgang

>-----Original Message-----
>From: Tarun Khanna [mailto:khanna using hertie-school.org]
>Sent: Friday, 21 August, 2020 12:33
>To: Viechtbauer, Wolfgang (SP); r-sig-meta-analysis using r-project.org
>Subject: Re: Dependant variable in Meta Analysis
>
>Thank you so much for all the insights so far. I am very grateful and
>looking forward to learning more in the meta analysis course in October.
>
>I wanted to follow up on my question about the dependent variable in
>meta-analysis. Just to summarize the discussion where we last left it: in the
>meta analysis that I am doing, there are 4 kinds of studies.
>
>1. studies that estimate the equation ln (y) = b0 + b1x + e, where x is a
>dummy variable that distinguishes two groups (e.g., x = 0 for group 1 and x
>= 1 for group 2)
>2. studies that estimate the equation y = b0 + b1x + e, x is a dummy
>variable that distinguishes two groups (e.g., x = 0 for group 1 and x = 1
>for group 2)
>3. studies that report mean and standard deviations of the two groups (mean
>and sd of y for x = 0 and x = 1)
>4. studies that report the difference between the means of the two groups
>and the pooled standard deviation (mean and standard deviation of y at x = 1
>-  y at x = 0)
>
>For the purpose of our meta analysis, studies of type 1 are most useful
>because b1*100 has the nice interpretation of percent change in y when x =
>1. Ideally I would like to transform the other studies so that I can retain
>this interpretation even in case of the aggregated estimated effect size.
>
>You had earlier recommended transforming estimates from studies of type 3 to
>ROM so that they are comparable to estimates from studies with ln (y) as
>dependent variable (Jensen's inequality aside). Could you perhaps also
>recommend a way to transform studies of types 2 and 4 so that we
>can retain the interpretation of the overall effect size as "percentage
>change in y when x = 1"?
>
>Of course if that's not possible I would use the r_coefficients to calculate
>the aggregate effect size.
>
>Thank you for your help and patience.
>
>Best
>Tarun
>
>Tarun Khanna
>PhD Researcher
>
>Hertie School
>
>Friedrichstraße 180
>10117 Berlin ∙ Germany
>khanna using hertie-school.org ∙ www.hertie-school.org
>________________________________________
>From: Viechtbauer, Wolfgang (SP)
><wolfgang.viechtbauer using maastrichtuniversity.nl>
>Sent: 07 June 2020 14:32:57
>To: Tarun Khanna; r-sig-meta-analysis using r-project.org
>Subject: RE: Dependant variable in Meta Analysis
>
>See responses below.
>
>>-----Original Message-----
>>From: Tarun Khanna [mailto:khanna using hertie-school.org]
>>Sent: Friday, 05 June, 2020 21:48
>>To: Viechtbauer, Wolfgang (SP); r-sig-meta-analysis using r-project.org
>>Subject: Re: Dependant variable in Meta Analysis
>>
>>Thank you for your clear answer.
>>
>>As you correctly said, most of the studies in my set use models of the form
>>ln(y) = b0 + b1 x + e. Can we relax the requirement of units of measurement of
>>y in this case because the interpretation of b1 is % change in y for unit
>>change in x?
>
>b1 is not % change; exp(b1) is the ratio, so (exp(b1) - 1)*100 is the %
>change. But yes, one could combine estimates of b1 from different studies
>even if the units of y differ across studies, as long as they only differ by
>a multiplicative transformation.
>
>>While most of the studies in my set employ regression models, some employ
>>difference of means tests (with the group means and standard errors reported).
>>How can I calculate coefficients in this case that are commensurable to the
>>ones coming from studies that employ the regression models? Would converting
>>the means to percentage change work? For example, if mt is the treatment mean
>>and ct is the control mean, then is the percentage difference (mt-ct)/ct
>>commensurable with estimates coming from the regression? A previous meta
>>analysis in the field does this but I am not sure if this is correct.
>
>In the model ln(y) = b0 + b1 x + e, if x is a dummy variable that
>distinguishes two groups (e.g., x = 0 for group 1 and x = 1 for group 2),
>then b1 is the estimated mean difference of log(y) for the two groups.
>That's similar (but not the same -- see below) to using the log-transformed
>ratio of means as the effect size measure. See help(escalc) and search for
>"ROM". Using (mt-mc)/mc would not be correct, since b1 is not a %
>change but a log-transformed ratio. And log((mt-mc)/mc) = log(mt/mc - 1),
>which is like ROM, but not quite right (due to the -1).
>
>The reason why using ROM isn't quite right is due to Jensen's inequality
>(https://en.wikipedia.org/wiki/Jensen's_inequality). b1 in the regression
>model is mean(log(y) for group 2) - mean(log(y) for group 1). However, you
>have mean(y for group 1) and mean(y for group 2) and when you compute "ROM"
>based on this, you get log(mean(y for group 2)) - log(mean(y for group 1)).
>These two mean differences are not the same. They might not differ greatly
>though. An example:
>
>library(metafor) # provides escalc()
>set.seed(1234)
>x <- c(rep(0,50), rep(1,50)) # group indicator, 50 observations per group
>y <- 100 + 5 * x + rnorm(100, 0, 10)
>lm(log(y) ~ x) # b1 = difference in mean log(y)
>mean(log(y)[x==1]) - mean(log(y)[x==0]) # same as b1
>log(mean(y[x==1])) - log(mean(y[x==0])) # ROM
>escalc(measure="ROM", m1i=mean(y[x==1]), m2i=mean(y[x==0]),
>       sd1i=sd(y[x==1]), sd2i=sd(y[x==0]), n1i=50, n2i=50)
>
>So, with this caveat aside (but discussed as part of the limitations), I
>would use ROM for those studies. You can also code 'b1 used vs ROM used' as
>a dummy variable and examine empirically via meta-regression if there are
>systematic differences between these two cases (although those could stem
>from other things besides Jensen's inequality).
>
>Best,
>Wolfgang
>
>>From: Viechtbauer, Wolfgang (SP)
>><wolfgang.viechtbauer using maastrichtuniversity.nl>
>>Sent: 04 June 2020 15:10:04
>>To: Tarun Khanna; r-sig-meta-analysis using r-project.org
>>Subject: RE: Dependant variable in Meta Analysis
>>
>>Assuming that the coefficients are commensurable, you can just meta-analyze
>>them directly. The squared standard errors of the coefficients are then the
>>sampling variances.
>>
>>With commensurable, I mean that they measure the same thing and can be
>>directly compared. For example, suppose the regression model y = b0 + b1 x + e
>>has been examined in multiple studies. Since b1 reflects how many units y
>>changes (on average) for a one-unit increase in x, the coefficient b1 is
>>only comparable across studies if y has been measured in the same units
>>across studies and x has been measured in the same units across studies (or
>>if there is a known linear transformation that converts x from one study
>>into the x from another study (and the same for y), then one can adjust b1
>>to make it commensurable across studies).
>>
>>In certain models, one can relax the requirement that the units must be the
>>same. For example, if the model is ln(y) = b0 + b1 x + e, then the units of
>>y can actually differ across studies if they are multiplicative
>>transformations of each other. If the model is ln(y) = b0 + b1 ln(x) + e,
>>then x can also differ across studies in terms of a multiplicative
>>transformation.
>>
>>I think the latter gets close to (or is?) what people in economics do to
>>estimate 'elasticities' and this is in fact what you might be dealing with.
>>
>>Another complexity comes into play when there are other x's in the model.
>>Strictly speaking, all models should include the same set of predictors as
>>otherwise the coefficient of interest is 'adjusted for' different sets of
>>covariates, which again makes it incommensurable. As a rough approximation
>>to deal with different sets of covariates across studies, one could fit a
>>meta-regression model (with the coefficient of interest as outcome) where
>>one uses dummy variables to indicate for each study which covariates were
>>included in the original regression models.
>>
>>Best,
>>Wolfgang
>>
>>>-----Original Message-----
>>>From: Tarun Khanna [mailto:khanna using hertie-school.org]
>>>Sent: Thursday, 04 June, 2020 14:16
>>>To: Viechtbauer, Wolfgang (SP); r-sig-meta-analysis using r-project.org
>>>Subject: Re: Dependant variable in Meta Analysis
>>>
>>>Thank you for your reply Wolfgang.
>>>
>>>The "beta coefficients" that I refer to are not standardized regression
>>>coefficients but the relevant regression coefficients in the original
>>>studies. Would it be correct to directly meta analyze the coefficients even
>>>when they are not standardized? How do we take into account the standard
>>>error of the coefficients? I have seen meta analyses in the literature that
>>>use the transformation beta coefficient / (sample size)^(1/2) but I don't see
>>>how that takes into account the associated standard error.
>>>
>>>I have instead been calculating r coefficients using the t values of the
>>>relevant coefficients and the sample size using the following formula.
>>>
>>>r = ( t^2 / (t^2 + sample size) )^(1/2)
>>>
>>>I have been using the r to Fisher's Z transformation that you
>>>mentioned. Unfortunately, like you mentioned, most of the studies
>>>employ multivariate analysis and so the transformation is not accurate. What
>>>would be the correct way to handle this?
>>>
>>>Best
>>>Tarun
>>>
>>>________________________________________
>>>From: Viechtbauer, Wolfgang (SP)
>>><wolfgang.viechtbauer using maastrichtuniversity.nl>
>>>Sent: 04 June 2020 13:56:59
>>>To: Tarun Khanna; r-sig-meta-analysis using r-project.org
>>>Subject: RE: Dependant variable in Meta Analysis
>>>
>>>Dear Tarun,
>>>
>>>What exactly do you mean by 'beta coefficient'? A standardized regression
>>>coefficient? In the (very unlikely) case that the model includes no other
>>>predictors and is just a standard regression model, then the standardized
>>>regression coefficient for that single predictor is actually identical to
>>>the correlation between the predictor and the outcome and converting this
>>>correlation via Fisher's r-to-z transformation is fine (and then 1/(n-3) can
>>>be used as the corresponding sampling variance). However, if there are other
>>>predictors in the model, then the standardized regression coefficient is not
>>>a simple correlation and while one can still apply Fisher's r-to-z
>>>transformation to the coefficient, it will not have a variance of 1/(n-3)
>>>and assuming so would be wrong.
>>>
>>>Why don't you just meta-analyze the 'beta coefficients' directly? If these
>>>coefficients reflect percentage change, it sounds like they are 'unitless'
>>>and comparable across studies. Then you get the pooled estimate of the
>>>percentage change directly from the model.
>>>
>>>Best,
>>>Wolfgang
>>>
>>>>-----Original Message-----
>>>>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>>>>On Behalf Of Tarun Khanna
>>>>Sent: Thursday, 04 June, 2020 13:41
>>>>To: r-sig-meta-analysis using r-project.org
>>>>Subject: [R-meta] Dependant variable in Meta Analysis
>>>>
>>>>Dear All,
>>>>
>>>>I am conducting a meta analysis of reduction in energy consumption in
>>>>households that have been exposed to certain behavioural interventions in
>>>>trials. The beta coefficients in the regressions in the original studies
>>>>can usually be interpreted as percentage change in electricity consumption.
>>>>To do the meta analysis I am converting these beta coefficients to Fisher's
>>>>Z. My problem is that Fisher's Z is not as easy to interpret as percentage
>>>>change in energy consumption.
>>>>
>>>>Question 1: Is it possible to do the meta analysis using the beta
>>>>coefficients coming from the original studies so that the results remain
>>>>easy to interpret?
>>>>
>>>>Question 2: Is it sensible to convert the final Fisher's Z estimates back to
>>>>the dependent variable coming from the studies?
>>>>
>>>>Sorry if this question sounds too basic.
>>>>
>>>>Best
>>>>
>>>>Tarun