[R-meta] Questions about the use of metaprop for the pooling of proportions

Tue Mar 8 23:19:44 CET 2022

Happy to see other people spending time at 11pm thinking about this kind of stuff :)

If we want to be really precise, qlogis((sum r_i)/(sum n_i)) is the MLE of the logit-transformed true proportion for the logistic regression model with a logit link. But since MLEs are invariant under transformations, plogis(qlogis((sum r_i)/(sum n_i))) = (sum r_i)/(sum n_i) is the MLE of the true proportion. In fact, this is neatly demonstrated by fitting the logistic regression with an identity link (do we even call this 'logistic' regression?!?):

coef(glm(out1/n ~ 1, weights = n, family = binomial(link = "identity")))
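A self-contained sketch of this check with simulated counts may help. The seed, the number of studies, the sample sizes and the event probability below are arbitrary assumptions for illustration, not data from this thread:

```r
# Simulated data (all values here are illustrative assumptions)
set.seed(1234)
k    <- 10                                # number of studies
n    <- rep(100, k)                       # sample size per study
out1 <- rbinom(k, size = n, prob = 0.3)   # event counts per study

# Pooled raw proportion: total events over total sample size
p_pooled <- sum(out1) / sum(n)

# Intercept-only binomial GLM with the default logit link;
# the intercept is on the logit scale, so back-transform with plogis()
fit_logit <- glm(out1/n ~ 1, weights = n, family = binomial)
p_logit   <- unname(plogis(coef(fit_logit)))

# Same model with an identity link; the intercept is the proportion itself
fit_ident <- glm(out1/n ~ 1, weights = n, family = binomial(link = "identity"))
p_ident   <- unname(coef(fit_ident))

# Both comparisons should agree up to numerical tolerance
all.equal(p_pooled, p_logit)
all.equal(p_pooled, p_ident)
```

Only base R is needed here; no meta-analysis package is involved in this particular check.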

That all of this happens 'automagically' is really a neat feature of logistic regression.
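
For completeness, the argument can be sketched as follows, writing r_i for the event counts, n_i for the sample sizes, and \pi for the common true proportion under homogeneity. The index k for the mutually exclusive outcomes is notation introduced here for illustration only:

```latex
% Binomial log-likelihood under homogeneity (common \pi across studies)
\[
  \ell(\pi) = \sum_i \bigl[ r_i \log \pi + (n_i - r_i) \log(1 - \pi) \bigr] + \text{const}
\]
% Setting the score to zero gives the pooled proportion
\[
  \frac{d\ell}{d\pi}
    = \frac{\sum_i r_i}{\pi} - \frac{\sum_i (n_i - r_i)}{1 - \pi} = 0
  \quad\Longrightarrow\quad
  \hat{\pi} = \frac{\sum_i r_i}{\sum_i n_i}
\]
% By invariance of the MLE, the estimate on the logit scale is
\[
  \hat{\theta} = \operatorname{logit}(\hat{\pi})
    = \log \frac{\hat{\pi}}{1 - \hat{\pi}}
\]
% For mutually exclusive outcomes k with \sum_k r_{ik} = n_i in every study,
% the separately pooled proportions necessarily add up to one
\[
  \sum_k \hat{\pi}_k
    = \frac{\sum_k \sum_i r_{ik}}{\sum_i n_i}
    = \frac{\sum_i n_i}{\sum_i n_i} = 1
\]
```

The last line only covers the fixed (common) effect estimates; it says nothing about the random effects estimates.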

Best,
Wolfgang

>-----Original Message-----
>From: Dr. Gerta Rücker [mailto:ruecker using imbi.uni-freiburg.de]
>Sent: Tuesday, 08 March, 2022 23:07
>To: Viechtbauer, Wolfgang (SP); Thiago Roza
>Cc: r-sig-meta-analysis using r-project.org
>Subject: Re: [R-meta] Questions about the use of metaprop for the pooling of
>proportions
>
>Hi Wolfgang,
>
>Thank you! Indeed I just saw that the ML estimate under the binomial
>model and the assumption of homogeneity gives (sum r_i)/(sum n_i). In
>fact this seems equivalent to logistic regression. Probably it works
>also under the multinomial model, I didn't write this down. I admit that
>I had never thought about this :(
>
>Best,
>
>Gerta
>
>Am 08.03.2022 um 22:58 schrieb Viechtbauer, Wolfgang (SP):
>> Hi Gerta,
>>
>> Under homogeneity, we have X_i ~ Binomial(n_i, pi), in which case
>> sum(X_i) ~ Binomial(sum(n_i), pi) and hence
>>
>> sum(out1)/sum(n)
>> plogis(coef(glm(out1/n ~ 1, weights = n, family = binomial)))
>>
>> or using metaprop() / rma.glmm()
>>
>> plogis(metaprop(out1, n)$TE.fixed)
>> plogis(coef(rma.glmm(measure="PLO", xi=out1, ni=n, method="EE")))
>>
>> are all identical. It goes to show how the logistic regression approach gives
>> an 'exact' model, based on the exact distributional properties of binomial
>> counts.
>>
>> As for Thiago's data: I think this is fine. But essentially he has multinomial
>> data. I recently described in a post how such data could be addressed if one
>> would want to analyze them all simultaneously:
>>
>> https://stat.ethz.ch/pipermail/r-sig-meta-analysis/2022-February/003878.html
>>
>> Best,
>> Wolfgang
>>
>>> -----Original Message-----
>>> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
>>> Behalf Of Dr. Gerta Rücker
>>> Sent: Tuesday, 08 March, 2022 20:30
>>> To: Thiago Roza
>>> Cc: r-sig-meta-analysis using r-project.org
>>> Subject: Re: [R-meta] Questions about the use of metaprop for the pooling of
>>> proportions
>>>
>>> Dear Thiago,
>>>
>>> I found that, apparently, the result presented by the common effect
>>> model (=fixed effect model) is simply the sum of all entries/events over
>>> all studies, divided by the total sample size (summed up over all
>>> studies). You see this by typing the following after the code in my last
>>> e-mail:
>>>
>>> all.equal(sum(out1)/sum(n), plogis(m1$TE.fixed))
>>> all.equal(sum(out2)/sum(n), plogis(m2$TE.fixed))
>>> all.equal(sum(out3)/sum(n), plogis(m3$TE.fixed))
>>>
>>> This means that the method is equivalent to considering the data as a
>>> contingency table where the rows correspond to the studies and the
>>> columns to the outcomes. The meta-analytic result corresponds to the
>>> percentages in the column sums, and of course these add to 100%. In fact
>>> this is the easiest way to deal with this kind of data.
>>>
>>> @Guido, @Wolfgang: I couldn't find this information on the metaprop or
>>> the rma.glmm help pages. Do you see any problem with interpreting
>>> Thiago's data as a contingency table? I think that, in contrast to
>>> pairwise comparison data, confounding/ecological bias is not an issue here.
>>>
>>> Best,
>>>
>>> Gerta
>>>
>>> Am 08.03.2022 um 19:30 schrieb Dr. Gerta Rücker:
>>>> Dear Thiago,
>>>>
>>>> So you have proportions of several mutually exclusive outcomes. Of
>>>> course, these are dependent because the sum is always the total
>>>> number of cases in the study (corresponding to 100% in that study).
>>>> Nevertheless, I don't see any reason not to pool each outcome
>>>> separately using metaprop(). In fact, depending on the transformation,
>>>> the resulting average proportions will not generally sum up to 100%,
>>>> particularly not when using no transformation at all. This raises the
>>>> question of which transformation to choose. The default in metaprop() is
>>>> a random intercept logistic regression model with the logit transformation.
>>>>
>>>> I made an observation that I have to think about, and you may try
>>>> this. If I use the default, the sum of the pooled percentages over all
>>>> outcomes is indeed always 1 for the fixed effect estimate. I used code
>>>> like this (here for 3 outcomes):
>>>>
>>>> #### Random data ####
>>>> library(meta)  # provides metaprop()
>>>> out1 <- rbinom(10, 100, 0.1)
>>>> out2 <- rbinom(10, 100, 0.5)
>>>> out3 <- rbinom(10, 100, 0.9)
>>>> n <- out1 + out2 + out3  # total cases per study
>>>> m1 <- metaprop(out1, n)
>>>> m2 <- metaprop(out2, n)
>>>> m3 <- metaprop(out3, n)
>>>> # sum of the back-transformed fixed effect estimates
>>>> plogis(m1$TE.fixed) + plogis(m2$TE.fixed) + plogis(m3$TE.fixed)
>>>>
>>>> (plogis is the inverse of the logit transformation, often called
>>>> "expit": plogis(x) = exp(x)/(1 + exp(x)).) These seem to sum up to 1
>>>> for the fixed effect estimates, but not in general for the random
>>>> effects estimates, only in case of small heterogeneity (which is
>>>> rarely the case with proportions).
>>>>
>>>> I am interested to hear whether this works with your data. (And I have
>>>> to prove that this holds in general ...)
>>>>
>>>> Best,
>>>>
>>>> Gerta