# [R-meta] Percent Correct as outcome variable

Viechtbauer Wolfgang (SP) wolfgang.viechtbauer at maastrichtuniversity.nl
Mon Feb 5 15:06:44 CET 2018

```
M_X and M_Y are mean proportions. So we know that as M_X gets close to 0 or 1, the variance must decrease; in fact, it must be 0 if M_X is equal to 0 or 1. On the other hand, for values of M_X around 0.5, the variance should tend to be larger.

You might want to plot M_X against Var(X) (or SD(X)) for those studies where both M_X and Var(X) are known to see what the relationship is. It should look something like this:

Var(X)
|    ***
|  ** * **
| * * ** **
|*         *
0+----------- M(X)
0          1
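To see this relationship in simulated data (a quick sketch; all numbers below are made up for illustration, not taken from any actual study):

set.seed(42)
nstudies <- 200
k <- 20   # items per person
n <- 50   # persons per study
res <- t(sapply(1:nstudies, function(i) {
   mu <- runif(1, 0.05, 0.95)                 # study-level average true proportion
   p  <- plogis(rnorm(n, qlogis(mu), 0.5))    # person-level true proportions
   x  <- rbinom(n, k, p) / k                  # observed proportions
   c(M = mean(x), SD = sd(x))
}))
plot(res[, "M"], res[, "SD"], xlab = "M(X)", ylab = "SD(X)")

The plotted SD(X) values trace out the inverted-U shape sketched above: largest around M(X) = 0.5, shrinking toward 0 at the extremes.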

The number of items is also a factor here. The variance between the observed proportions is a function of between-person differences in the persons' underlying true proportions plus variance due to differences between the observed proportions and the underlying true proportions (i.e., measurement error). The latter variance will decrease as the number of items increases. So at least part of the differences in Var(X) across studies can be accounted for by the number of items.
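This decomposition can be checked numerically (a sketch with made-up numbers): holding the true proportions fixed, the variance of the observed proportions is roughly Var(true) + E[p(1-p)]/k, so it shrinks toward the true between-person variance as the number of items k grows:

set.seed(1)
n <- 10000
p <- rbeta(n, 5, 5)                              # true per-person proportions
var_obs <- function(k) var(rbinom(n, k, p) / k)  # observed-proportion variance
c(true = var(p), k10 = var_obs(10), k50 = var_obs(50), k200 = var_obs(200))
# the observed variance exceeds var(p) by roughly E[p(1-p)]/k,
# so it decreases toward var(p) as k increases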

So in the end, if you want to try something a bit more sophisticated than just plugging in one best guess, you could try to fit a regression model to predict Var(X) based on M(X) (allowing for a curvilinear relationship) plus the number of items (or some function thereof like 1/(number of items)) as predictors (and maybe interaction terms). If it turns out that Var(X) can be predicted quite well based on such a model, you could then impute missing Var(X) values for studies where you only know M(X) and the number of items. The same would go for Var(Y).
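A sketch of what such an imputation model could look like (all variable names and numbers here are hypothetical; dat stands for a data frame with one row per study):

set.seed(2)
k <- sample(c(10, 20, 40), 60, replace = TRUE)            # number of items per study
M <- runif(60, 0.1, 0.9)                                  # reported mean proportions
Var <- M * (1 - M) / k + 0.05 * M * (1 - M) + rnorm(60, 0, 0.001)
dat <- data.frame(M = M, k = k, Var = Var)
dat$Var[sample(60, 15)] <- NA                             # studies that did not report Var(X)

# curvilinear effect of M plus a 1/k term, fit on the complete cases
fit <- lm(Var ~ M + I(M^2) + I(1/k), data = dat, subset = !is.na(Var))
summary(fit)$r.squared

# impute predicted values for the studies where Var(X) is missing
miss <- is.na(dat$Var)
dat$Var[miss] <- predict(fit, newdata = dat[miss, ])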

I am actually currently working on a meta-analysis where the same issue has come up. I find exactly the curvilinear relationship as described above in these data (I use SD(X) on the y-axis). The model with M(X) and M(X)^2 as predictors gives me an R^2 of 0.31. Adding 1/(number of items) as a predictor increases this to R^2 = 0.44. I have some more relevant predictors that push R^2 easily above 0.5. So I am quite confident that one can get fairly good predicted values for studies where Var(X) is missing using such an approach.

Best,
Wolfgang

>-----Original Message-----
>From: Markus Janczyk [mailto:markus.janczyk at uni-tuebingen.de]
>Sent: Sunday, 04 February, 2018 13:48
>To: Viechtbauer Wolfgang (SP); r-sig-meta-analysis at r-project.org
>Subject: Re: [R-meta] Percent Correct as outcome variable
>
>Thanks Wolfgang, it is exactly the problem that most often no
>within-group variances were reported, nor the relevant t-tests from
>which to back-calculate the variance... Is there still a method for a
>reasonable imputation of the variance? Something like a best guess? We
>know the sample size of course, and sometimes the number of items per
>participant that were administered and could be remembered correctly or
>incorrectly.
>
>Best, Markus
>
>Am 04.02.2018 um 13:33 schrieb Viechtbauer Wolfgang (SP):
>> Dear Markus,
>>
>> Sure, you can compute D = M_X - M_Y. Since M_X and M_Y are means, this
>> is a mean difference. This can be easily handled by metafor, meta, etc.
>> In metafor, you can compute mean differences with escalc(measure="MD",
>> ...) and then pass those to rma(). In meta, there is metacont(sm="MD",
>> ...).
>>
>> The only problem is that in order to compute the sampling variance of a
>> mean difference, you need the within-group variances, since:
>>
>> var(D) = Var(X) / n_X + Var(Y) / n_Y.
>>
>> So, unless you impute the missing within-group variances, you cannot
>> include such studies when using standard meta-analytic procedures.
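For example, with made-up summary values (this is the same computation escalc(measure="MD") carries out from the means, SDs, and group sizes):

M_X <- 0.80; Var_X <- 0.10^2; n_X <- 30
M_Y <- 0.70; Var_Y <- 0.11^2; n_Y <- 30
D     <- M_X - M_Y                   # raw mean difference
var_D <- Var_X / n_X + Var_Y / n_Y   # its sampling variance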
>>
>> If you do know the t-test statistic or p-value (from which one could
>> back-calculate the t-statistic), then:
>>
>> Var(pooled) = D^2 / (t^2 * (1/n_X + 1/n_Y))
>>
>> so you can back-calculate the pooled within-group variance (assuming
>> the t-test was also computed based on the pooled variance) and then
>> assume Var(pooled) = Var_X = Var_Y for computing var(D).
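Numerically (made-up values; t is the reported two-sample t-statistic):

n_X <- 30; n_Y <- 30
D <- 0.10       # observed mean difference
t <- 3.2        # reported t-statistic
var_pooled <- D^2 / (t^2 * (1/n_X + 1/n_Y))
var_D <- var_pooled / n_X + var_pooled / n_Y   # Var(pooled) used for both groups
# sanity check: this var_pooled reproduces the reported t
D / sqrt(var_pooled * (1/n_X + 1/n_Y))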
>>
>> Best,
>> Wolfgang
>>
>>> -----Original Message-----
>>> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at
>>> r-project.org] On Behalf Of Markus Janczyk
>>> Sent: Monday, 29 January, 2018 11:05
>>> To: r-sig-meta-analysis at r-project.org
>>> Subject: [R-meta] Percent Correct as outcome variable
>>>
>>> Dear everybody,
>>>
>>> We are interested in a meta-analytic question related to memory
>>> research.
>>>
>>> Consider the case where two conditions X and Y are compared.
>>> Unfortunately, in many of the (old) papers only the means M_X and M_Y
>>> are reported for the percent of correctly remembered items (and often
>>> not even the interesting t-test). To calculate a "real" effect size as
>>> the outcome measure to use in a meta-analysis, we need some
>>> variability measure though (if I understood the metafor functions
>>> right). With the data we have, it feels like a raw (and
>>> non-standardized) effect is all we can calculate as the outcome
>>> variable (i.e., D = M_X - M_Y).
>>>
>>> Is there any other possible solution or improvement somebody knows of
>>> or recommends, or can somebody point me to a reference where I can
>>> look?
>>>
>>> Thanks, Markus
>>>
>>> --
>>> Jun.-Prof. Dr. phil. habil. Markus Janczyk, Dipl.-Psych.
>>>
>>> University of Tübingen
>>> Department of Psychology
>>> Cognition and Action
>>> Schleichstraße 4
>>> 72076 Tübingen
>>> Germany
>>>
>>> http://www.pi.uni-tuebingen.de/arbeitsbereiche/kognition-und-handlung/research-group.html
>>> email: markus.janczyk at uni-tuebingen.de
>>> phone: +49 (0)7071 2976761
```