[R-meta] Calculation of p values in selmodel

James Pustejovsky jepusto at gmail.com
Sun Mar 17 17:46:40 CET 2024


This is an issue with maximum likelihood estimation of the step function
selection model generally (rather than a problem with the software
implementation).

The step function model assumes that there are different selection
probabilities for effect size estimates with p-values that fall into
different intervals. For a 3-parameter model, the intervals are [0, .025]
and (.025, 1], with the first interval fixed to have selection probability
1 and the second interval having selection probability lambda > 0 (an
unknown parameter of the model). If there are no observed ES estimates in
the first interval, then the ML estimate of lambda is infinite. If there
are no observed ES estimates in the second interval, then the ML estimate
of lambda is zero, outside of the parameter space.

In some of my work, I've implemented an ad hoc fix for the issue by moving
the p-value threshold around so that there are at least three ES estimates
in each interval. This isn't based on any principle in particular, although
Jack Vevea once suggested to me that this might be the sort of thing an
analyst might do just to get the model to converge.
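Roughly, that ad hoc fix looks like the following hypothetical helper (the
choose_step() function and the p-value calculation are my own shorthand,
assuming selmodel's default one-sided, alternative = "greater", p-values,
more than k estimates, and no exact ties):

    choose_step <- function(pvals, step = 0.025, k = 3) {
      p <- sort(pvals)
      if (sum(p <= step) < k) step <- p[k]              # pull the step up
      if (sum(p >  step) < k) step <- p[length(p) - k]  # push the step down
      step
    }

    # res   <- rma(yi, vi, data = dat)
    # p_one <- pnorm(res$yi / sqrt(res$vi), lower.tail = FALSE)
    # sel   <- selmodel(res, type = "step", steps = choose_step(p_one))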

A more principled way to fix the issue would be to use penalized likelihood
or Bayesian methods with an informative prior on lambda. See the publipha
package (https://cran.r-project.org/package=publipha) for one
implementation.
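As a toy illustration of the penalized-likelihood idea (building on the
negll_3psm sketch above, and only a sketch, not what publipha actually
does), one can add a normal penalty on log(lambda), i.e., a log-normal
prior on lambda, so the profile no longer runs off to the boundary when an
interval is empty:

    negll_pen <- function(par, yi, vi, alpha = 0.025, sd_loglambda = 1) {
      negll_3psm(par, yi, vi, alpha) -
        dnorm(par[3], mean = 0, sd = sd_loglambda, log = TRUE)  # par[3] = log(lambda)
    }

    # fit <- optim(c(0, log(0.04), 0), negll_pen, yi = dat$yi, vi = dat$vi, hessian = TRUE)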

James

On Sat, Mar 16, 2024 at 10:23 PM Will Hopkins via R-sig-meta-analysis <
r-sig-meta-analysis using r-project.org> wrote:

> No-one has responded to this issue. It's now causing a problem in my
> simulations when I am analyzing for publication bias arising from deletion
> of 90% of nonsignificant study estimates and ending up with small numbers
> (10-30) of included studies. See below (and attached as an easier-to-read
> text file) for an example. Two of the 14 study estimates (Row 8 and 9) were
> non-significant, but the original t value (tOrig) would have made them
> significant in selmodel(…, type = "step", steps = (0.025)). So I processed
> any observations with non-significant p values and t>1.96 by replacing the
> standard error (YdelSE) with Ydelta/1.95. The resulting new t values (tNew)
> are 1.95 for both those observations, whereas all the other t values are
> unchanged. So they should be non-significant in selmodel, right?  But I
> still get this error message:
>
> Error in selmodel.rma.uni(x, type = "step", steps = (0.025)) :
>
>   One or more intervals do not contain any observed p-values (use
> 'verbose=TRUE' to see which).
>
> I must be doing something idiotic, but what? Help, please!
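One quick diagnostic, assuming res is the rma() fit for this simulation and
selmodel's default one-sided p-values, is to tabulate which interval each
p-value falls into:

    p_one <- pnorm(res$yi / sqrt(res$vi), lower.tail = FALSE)  # p-values selmodel uses by default
    table(cut(p_one, breaks = c(0, 0.025, 1)))                 # how many estimates per interval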
>
>
>
> Oh, and thanks again to Tobias Saueressig for his help with
> list-processing of the objects created by rma, selmodel and confint. My
> original for-loop approach fell over when the values of the Sim variable
> were not consecutive integers (for example, when I had generated the sims
> and then deleted any lacking non-significant study estimates), but separate
> processing of the lists as suggested by Tobias worked perfectly. It stops
> working when it crashes out with the above error, but hopefully someone
> will solve that problem.
>
>
>
> Will
>
>
>
>     Sim  StudID  Sex     SSize  Ydelta  YdelSE  tOrig  tNew   pValue
>   <dbl>   <dbl>  <fct>   <dbl>   <dbl>   <dbl>  <dbl> <dbl>    <dbl>
> 1   448       1  Female     10    3.72   0.684   5.44  5.44  0.000413
> 2   448       6  Female     10    3.08   0.901   3.42  3.42  0.00766
> 3   448      11  Female     10    4.49   0.926   4.85  4.85  0.000906
> 4   448      21  Female     28    4.95   0.777   6.37  6.37  0.000000808
> 5   448      26  Female     12    3.82   1.25    3.06  3.06  0.0109
> 6   448      31  Female     22    2.13   0.991   2.15  2.15  0.0433
> 7   448      36  Female     10    3.27   1.13    2.89  2.89  0.0177
> 8   448      10  Male       18    4.46   2.29    2.03  1.95  0.0578
> 9   448      14  Male       10    3.2    1.64    1.98  1.95  0.0795
> 10  448      17  Male       13    4.32   1.97    2.19  2.19  0.049
> 11  448      30  Male       10    1.16   0.467   2.48  2.48  0.0348
> 12  448      38  Male       10    3.61   1.24    2.91  2.91  0.0175
> 13  448      39  Male       10    2.49   0.828   3.01  3.01  0.0148
> 14  448      40  Male       28    1.92   0.602   3.19  3.19  0.0036
>
>
> *From:* Will Hopkins <willthekiwi using gmail.com>
> *Sent:* Friday, March 15, 2024 8:39 AM
> *To:* 'R Special Interest Group for Meta-Analysis' <
> r-sig-meta-analysis using r-project.org>
> *Subject:* Calculation of p values in selmodel
>
>
>
> According to your documentation, Wolfgang, the selection models in
> selmodel are based on the p values of the study estimates, but these are
> computed by assuming the study estimate divided by its standard error has a
> normal distribution, whereas significance in the original studies of mean
> effects of continuous variables would have been based on a t distribution.
> It could make a difference when sample sizes in the original studies are
> ~10 or so, because some originally non-significant effects would be treated
> as significant by selmodel. For example, with a sample size of 10, a mean
> change has 9 degrees of freedom, so a p value of 0.080 (i.e.,
> non-significant, p>0.05) in the original study will be given a p value of
> 0.049 (i.e., significant, p<0.05) by selmodel. Is this issue likely to make
> any real difference to the performance of selmodel with meta-analyses of
> realistic small-sample studies? I guess that only a small (negligible?)
> proportion of p values will fall between 0.05 and 0.08, in the worst-case
> scenario of a true effect close to the critical value and with only 9
> degrees of freedom for the SE. If it is an issue, you could include the
> SE's degrees of freedom in the rma object that gets passed to selmodel.
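A quick check of the arithmetic in that example:

    tcrit <- qt(1 - 0.080 / 2, df = 9)  # |t| giving a two-sided p of 0.080 with 9 df, about 1.97
    2 * pnorm(-tcrit)                   # the same statistic treated as standard normal: about 0.049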
>
>
>
> Will
> _______________________________________________
> R-sig-meta-analysis mailing list @ R-sig-meta-analysis using r-project.org
> To manage your subscription to this mailing list, go to:
> https://stat.ethz.ch/mailman/listinfo/r-sig-meta-analysis
>

