[R-meta] Selection models from *reported p-values*

James Pustejovsky jepusto@gmail.com
Tue Mar 5 17:55:38 CET 2024


Yashvin,

This is an interesting question, which highlights a potential limitation of
existing meta-analytic selection models (at least those that I'm aware of).

Just to add a thought to Wolfgang's response: the reason that it would be
difficult to modify existing selection models to work with observed
p-values is that current models assume that the p-value is a direct
function of the effect size estimate and its standard error, and the effect
size estimates are the _outcomes_ in the model. So the model implies a
_distribution_ of p-values based on the data-generating process, and we
need to know what that distribution is. In particular, to work with an
observed p-value, we would need to know how the observed p-value is
functionally related to the effect size estimate, and this will depend on
lots of details about the effect size metric, study design, and analytic
methods (your method of calculating the effect size estimate and the
authors' method of calculating p-values).
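
Just to make the "direct function" idea concrete, here is a tiny R sketch (my own made-up numbers, not from any real study) of the Wald-type p-value that existing selection models effectively work with, computed from an effect size estimate yi and its sampling variance vi:

# Hypothetical effect size estimate and sampling variance (made-up values)
yi <- 0.35
vi <- 0.04

zi <- yi / sqrt(vi)                          # Wald-type test statistic

# p-values implied by the model's normality assumption; which one is relevant
# depends on how the selection process is assumed to operate
p_one_sided <- pnorm(zi, lower.tail = FALSE)
p_two_sided <- 2 * pnorm(abs(zi), lower.tail = FALSE)
c(one_sided = p_one_sided, two_sided = p_two_sided)

The model treats p-values like these as the inputs to the selection process, which is why an observed p-value computed in some other way does not slot in directly.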

For some types of transformations, I think the discrepancies will be quite
small; two examples follow, with a small numerical sketch after the list.
* For example, say that the author reports a p-value for an untransformed
correlation coefficient, but you meta-analyze the results based on Fisher
z-transformation. For r near zero, the SE of the untransformed coefficient
will be quite close to the SE of the z-transformed coefficient, so using
one or the other will not make much difference at all.
* For another example, say that you do a multiplicative reliability
correction to a correlation coefficient. In this case, the SE of the
corrected coefficient should also be multiplied by the reliability
correction (that is, if we're treating the correction as a fixed constant),
and so the ratio of the corrected correlation to the corrected SE will be
the same as the ratio of the uncorrected correlation to the uncorrected SE,
and the p-value should be the same in both cases.
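
Here is a small numerical sketch of both points (my own made-up values for r, n, and the reliability):

# 1) Raw correlation vs. Fisher z: for r near zero, the Wald statistics
#    (and hence the p-values) are very similar
r <- 0.10
n <- 50
se_r <- sqrt((1 - r^2)^2 / (n - 1))          # large-sample SE of the raw correlation
se_z <- sqrt(1 / (n - 3))                    # SE of the Fisher z-transformed correlation
c(p_raw    = 2 * pnorm(-abs(r / se_r)),
  p_fisher = 2 * pnorm(-abs(atanh(r) / se_z)))

# 2) Multiplicative reliability correction: scaling both the correlation and
#    its SE by the same constant leaves the ratio, and hence the p-value, unchanged
rxx  <- 0.8                                  # assumed reliability
r_c  <- r / sqrt(rxx)
se_c <- se_r / sqrt(rxx)
all.equal(r / se_r, r_c / se_c)              # TRUE: identical test statistic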

Finally, here's a potentially more problematic/controversial
counter-example. Say that you are meta-analyzing standardized mean
differences from randomized experiments with pre-test and post-test data,
and for the sake of uniformity you are using a difference-in-differences
estimate for the numerator. But some of the primary studies use ANCOVA for
their analysis, so your effect size estimate, SE, and p-value will differ from
those based on the analysis reported in the primary study. Your analysis is
less precise than the primary study analysis, so your p-value will tend to
be larger than the primary study p-value. Further, maybe you are making an
assumption about the pre/post correlation rather than using the primary
study data to infer it, and this will introduce a further discrepancy.
Personally, I don't have a sense of how big a discrepancy in p-values you
can get in this situation. I think it's an interesting question that would
be worth looking into (and maybe carrying it through to investigating the
implications for the performance of meta-analytic selection models). But
pragmatically, the discrepancy could be resolved by using the information
from the primary analytic approach (ANCOVA) to calculate the effect size
estimate and its standard error, at least to the extent that this is
possible given the statistics reported in the primary study.
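
To give a feel for the ANCOVA-versus-change-score issue, here is a rough simulation sketch (entirely made-up values for the sample size, effect, and pre/post correlation; not based on any study in the thread):

# Simulate one randomized pre/post study and compare the p-value from a
# difference-in-differences analysis with the p-value from an ANCOVA
set.seed(42)
n   <- 40                                    # per-group sample size (assumed)
rho <- 0.6                                   # assumed pre/post correlation
pre   <- rnorm(2 * n)
post  <- rho * pre + sqrt(1 - rho^2) * rnorm(2 * n) +
         rep(c(0, 0.4), each = n)            # treatment adds 0.4 SD at post-test
group <- rep(c("control", "treatment"), each = n)
dat <- data.frame(group, pre, post, change = post - pre)

did <- t.test(change ~ group, data = dat, var.equal = TRUE)    # diff-in-diff
anc <- lm(post ~ group + pre, data = dat)                       # ANCOVA

c(p_did    = did$p.value,
  p_ancova = summary(anc)$coefficients["grouptreatment", "Pr(>|t|)"])

# Across repeated samples the ANCOVA p-value will usually be smaller, because its
# residual variance (roughly (1 - rho^2) * sigma^2) is less than the variance of
# the change scores (2 * (1 - rho) * sigma^2) whenever rho < 1.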

Best,
James

On Tue, Mar 5, 2024 at 7:17 AM Viechtbauer, Wolfgang (NP) via
R-sig-meta-analysis <r-sig-meta-analysis@r-project.org> wrote:

> Dear Yashvin,
>
> I haven't thought this all the way through, but the problem with this is
> that p-values enter the model in two different ways. There are indeed the
> actually observed p-values of the studies, but in the integration step
> (which is needed to compute the log likelihood), we also need to compute
> p-values. Those are not fixed, but arise from integrating over the density
> (assumed to be normal) of the effect size estimates. These p-values (which
> then enter the weight function) are computed as a function of y/sqrt(vi).
> If we use one way of computing the observed p-values and a different way of
> computing the p-values in this integration step, then there is a bit of a
> mismatch and I am not sure about the consequences of that. So for
> consistency, one should then also compute the p-values in the integration
> step in a corresponding manner, but this would be very case/measure/test
> specific and trying to fine-tune this for every specific measure and way of
> testing it becomes extremely difficult implementation-wise.
>
> We can see a bit of this in Iyengar and Greenhouse (1988) where the weight
> function is based on a t instead of a normal distribution (analogous to a
> z- versus a t-test). But this leads to the extra headache-inducing
> complexities in their appendix. I (and others) decided to avoid all of this
> by making the simplifying assumption that the p-values are always computed
> based on Wald-type tests of the form 'estimate / SE'.
>
> This should not be too far off in many cases, especially if the sample
> sizes within studies are not small. For example, the difference between
> pnorm(2, lower.tail=FALSE) and pt(2, df=100, lower.tail=FALSE) is of very
> little practical consequence. Also, selection models are really rough
> approximations to a much more complex data-generating mechanism anyway, so
> trying to fine-tune this part of the model is like taking a ruler to align
> something to millimeter accuracy before taking a sledgehammer to smash it.
>
> A bit like the bias correction for d-values. Whether you put d=0.53 or
> g=0.52 into your model makes so little difference compared to all the other
> inaccuracies and infidelities we accept in putting together our
> meta-analytic datasets in the first place.
>
> But those are just my two cents.
>
> Best,
> Wolfgang
>
> > -----Original Message-----
> > From: R-sig-meta-analysis <mailman-bounces@stat.ethz.ch> On Behalf Of Seetahul, Yashvin
> > Sent: Tuesday, March 5, 2024 13:09
> > To: r-sig-meta-analysis@r-project.org
> > Cc: r-sig-meta-analysis-owner@r-project.org
> > Subject: Selection models from *reported p-values*
> >
> > Dear R meta-analysis community,
> >
> > I have a question with regard to selection models based on p-values.
> >
> > Is it possible to do the selection model based on reported p-values directly
> > rather than the p-values calculated from the effect size and SE?
> >
> > In many cases, meta-analyses require transformations, or sometimes corrections.
> > However, if we assume that there is a selection process in publishing papers
> > that is based on the p-values, it would make more sense to consider the
> > p-values that are reported in the papers, would it not?
> >
> > How would one proceed to do this? I believe the selmodel() function in metafor
> > works with objects fitted with the rma() function; therefore, the p-values are
> > re-calculated only from the effect size and SE. Assuming I have the reported
> > p-values (detailed up to three decimals) of all the studies included in my
> > meta-analysis, is it possible to test for the selection of studies based on
> > reported p-values and then correct the effect size?
> >
> > I hope my question makes sense,
> >
> > Thank you for your help,
> >
> > Yashvin Seetahul
> _______________________________________________
> R-sig-meta-analysis mailing list @ R-sig-meta-analysis@r-project.org
> To manage your subscription to this mailing list, go to:
> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>
