[R-meta] Redundant predictors

Mon Jun 29 16:37:02 CEST 2020

Dear Arne,

Please keep the mailing list in cc.

Indeed, adding 'extreme' effects could drive up the heterogeneity to the point that reaching a significant result becomes difficult or even impossible. And yes, you could fix the variance component(s) to avoid this. Alternatively, instead of adding very large effects, one could add effects that have the same size as the average effect estimated from the initial model. That would have the opposite effect, driving down heterogeneity as more and more such effects are added.
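
To make these two scenarios concrete, here is a minimal sketch in base R. It is an illustration under stated assumptions, not part of the analysis itself: dl_tau2() is a hypothetical helper name that implements the standard DerSimonian-Laird estimator (so it should agree with rma(yi, vi, method="DL")$tau2), the five effects are the same values used in the example in this message, the number of new effects (5) is arbitrary, and the new effects are assumed to have a sampling variance equal to the harmonic mean of the observed ones.

```r
# Illustrative data (same values as in the example in this message)
yi <- c(0.22, -0.12, 0.41, 0.13, 0.08)
vi <- c(0.008, 0.002, 0.019, 0.010, 0.0145)

# Hypothetical helper: DerSimonian-Laird estimate of tau^2
dl_tau2 <- function(yi, vi) {
   wi <- 1/vi
   mu <- sum(wi * yi) / sum(wi)          # inverse-variance weighted average
   Q  <- sum(wi * (yi - mu)^2)           # Cochran's Q statistic
   C  <- sum(wi) - sum(wi^2) / sum(wi)
   max(0, (Q - (length(yi) - 1)) / C)    # estimate is truncated at zero
}

j    <- 5               # number of new effects to add (arbitrary choice)
vnew <- 1/mean(1/vi)    # harmonic mean of the observed sampling variances

# tau^2 and the random-effects average from the initial data
tau2.orig <- dl_tau2(yi, vi)
avg <- sum(yi / (vi + tau2.orig)) / sum(1 / (vi + tau2.orig))

# scenario 1: add j effects equal to the largest observed effect
tau2.extreme <- dl_tau2(c(yi, rep(max(yi), j)), c(vi, rep(vnew, j)))

# scenario 2: add j effects equal to the estimated average effect
tau2.average <- dl_tau2(c(yi, rep(avg, j)), c(vi, rep(vnew, j)))

round(c(original = tau2.orig, extreme = tau2.extreme, average = tau2.average), 4)
```

For these particular values, the estimate of tau^2 should go up in the first scenario and down in the second; with other data the size of the change will differ.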

Even better would be an approach where we simulate new effects taking the amount of heterogeneity and the sampling variability into consideration. For a given number of 'new' effects to be added, one would then repeat this many times, checking in what proportion of cases the combined effect is significant. By increasing the number of new effects to be added, one could then figure out how many effects need to be added such that power to find a significant effect is at least 80% (or some other %). Here is an example of this idea:

library(metafor)

yi <- c(0.22, -0.12, 0.41, 0.13, 0.08)
vi <- c(0.008, 0.002, 0.019, 0.010, 0.0145)

res <- rma(yi, vi, method="DL")
res

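iters <- 1000 # number of simulated datasets per value of j

maxj <- 20 # largest number of new effects to consider

power <- rep(NA, maxj)
pvals <- rep(NA, iters)

set.seed(42)

# sampling variance assumed for the new effects (harmonic mean of the observed ones)
vnew <- 1/mean(1/vi)

for (j in 1:maxj) {
   print(j)
   for (l in 1:iters) {
      # draw j new effects around the estimated average effect with variance tau^2 + vnew
      yi.fsn <- c(yi, rnorm(j, coef(res), sqrt(res$tau2 + vnew)))
      vi.fsn <- c(vi, rep(vnew, j))
      pvals[l] <- rma(yi.fsn, vi.fsn, method="DL")$pval
   }
   # proportion of simulated datasets in which the combined effect is significant
   power[j] <- mean(pvals <= .05)
}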

plot(1:maxj, power, type="o")
abline(h=.80, lty="dotted")
min(which(power >= .80))

So, 15 effects would have to be added to reach 80% power. Note that the line is a bit wiggly, but one could just increase the number of iterations to smooth it out.
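
As a rough guide to how many iterations are needed: each power value is a proportion based on 'iters' independent simulated datasets, so its Monte Carlo standard error is approximately sqrt(p*(1-p)/iters). A minimal sketch, where mcse() is a hypothetical helper name and not a metafor function:

```r
# Hypothetical helper: Monte Carlo standard error of a simulated proportion
mcse <- function(p, iters) sqrt(p * (1 - p) / iters)

# standard error of a power estimate near 0.80 for different numbers of iterations
sapply(c(1000, 10000, 100000), function(n) mcse(0.80, n))
```

With 1000 iterations, an estimate near 80% has a standard error of a bit over one percentage point, which is enough for adjacent points in the plot to cross; ten times as many iterations reduce it by a factor of sqrt(10).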

Best,
Wolfgang

>-----Original Message-----
>From: Arne Janssen [mailto:arne.janssen using uva.nl]
>Sent: Monday, 29 June, 2020 2:25
>To: Viechtbauer, Wolfgang (SP)
>Subject: Re: [R-meta] Redundant predictors
>
>Dear Wolfgang,
>
>I did try something like you suggested just now. With a simple rma it
>works by simply putting the model calculation in a loop and adding the
>most extreme case at each step until p< 0.05.
>When I use rma.mv with a random factor, it does not. The estimate
>decreases of course, but the s.e. and p increase, so I guess the
>heterogeneity is increasing much indeed. I am now considering to use the
>tau of the original model to get a conservative estimate of the numbers
>of studies needed to get a significant effect. What are your thoughts on
>this?
>
>Thanks a lot in advance.
>
>Best wishes,
>Arne
>
>On 28-Jun-20 17:37, Viechtbauer, Wolfgang (SP) wrote:
>> Dear Arne,
>>
>> Your understanding of the fail-safe N is correct, although the way this number is often computed makes use of Stouffer's method for pooling the p-values of the studies and doesn't actually make use of the effect sizes. To illustrate:
>>
>> library(metafor)
>>
>> yi <- c(0.1, 0.07, 0.26, 0.24, 0.19, -0.02, 0.09, -0.04, 0.18, -0.08, -0.18, 0.3, -0.09, 0.06, 0.15, -0.05)
>> vi <- c(0.00943, 0.00134, 0.01923, 0.00962, 0.01449, 0.01613, 0.00585, 0.0031, 0.01818, 0.0013, 0.01887, 0.01136, 0.00885, 0.00187, 0.00645, 0.01613)
>>
>> rma(yi, vi, method="FE")
>>
>> So a meta-analysis (using a FE model) yields a significant effect. Now let's compute the fail-safe N:
>>
>> fsn(yi, vi)
>>
>> This says that 35 studies with a null result would need to be added to yield a non-significant effect. This approach uses Stouffer's method for pooling the (one-sided) p-values. The p-value for the 16 studies using this method is obtained with:
>>
>> zi<- yi / sqrt(vi)
>> pnorm(sum(zi) / sqrt(16), lower.tail=FALSE)
>>
>> So, if we add 35 studies that have (on average) a z-statistic of 0, then we get a non-significant pooled p-value:
>>
>> pnorm(sum(zi) / sqrt(16 + 35), lower.tail=FALSE)
>>
>> So that checks out. But note that this makes no reference to effects - it just uses the p-values and hence z-statistics of the studies.
>>
>> An approach that is based on the same idea as a FE model is the one by Rosenberg:
>>
>> fsn(yi, vi, type="Rosenberg")
>>
>> This says that 3 studies with null effects would need to be added to render the FE model non-significant (where the sampling variances of those 3 effects are assumed to be equal to the harmonic mean of the sampling variances of the observed effects). So in other words:
>>
>> yi.fsn <- c(yi, rep(0, 3))
>> vi.fsn<- c(vi, rep(1/mean(1/vi), 3))
>> rma(yi.fsn, vi.fsn, method="FE")
>>
>> And indeed, that just fails to be significant at alpha = .05.
>>
>> Using the same idea, we could reverse this process. Let's say we start with these effects, which yield a non-significant result based on a FE model (p = .29):
>>
>> yi <- c(0.05, 0.07, 0.10, 0.14, 0.02, -0.15, 0.09, -0.04, 0.11, -0.08, -0.18, 0.22, -0.09, 0.06, 0.11, -0.05)
>> rma(yi, vi, method="FE")
>>
>> Now by trial-and-error, I can easily figure out that 2 studies with an effect equal to the maximum observed effect are needed to make the FE model significant:
>>
>> yi.fsn<- c(yi, rep(max(yi), 2))
>> vi.fsn<- c(vi, rep(1/mean(1/vi), 2))
>> rma(yi.fsn, vi.fsn, method="FE")
>>
>> If one wanted to do this in the context of a RE model, things get trickier because one would have to factor in the between-study variance component. So, we start with:
>>
>> res<- rma(yi, vi)
>> res
>>
>> Now we could assume that the new studies being added come from the same population of studies and their addition does not alter the estimate of tau^2. Then we again just need to add 2 studies here:
>>
>> yi.fsn<- c(yi, rep(max(yi), 2))
>> vi.fsn<- c(vi, rep(1/mean(1/vi), 2))
>> rma(yi.fsn, vi.fsn, tau2=res$tau2)
>>
>> Note that I fix tau^2 to the value obtained from the RE model fitted to the observed data based on my earlier assumption.
>>
>> But adding studies with such large effects does actually drive up the heterogeneity, so if one were to reestimate tau^2, the result would not be significant:
>>
>> rma(yi.fsn, vi.fsn)
>>
>> If we do that, then we would need to add 3 studies:
>>
>> yi.fsn<- c(yi, rep(max(yi), 3))
>> vi.fsn<- c(vi, rep(1/mean(1/vi), 3))
>> rma(yi.fsn, vi.fsn)
>>
>> In case you try something like this with your own data, I would be interested in hearing what you find.
>>
>> Best,
>> Wolfgang
>>
>>> -----Original Message-----
>>> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org]
>>> On Behalf Of Arne Janssen
>>> Sent: Sunday, 28 June, 2020 16:31
>>> To: 'r-sig-meta-analysis using r-project.org'
>>> Subject: [R-meta] Redundant predictors
>>>
>>> L.S.,
>>>
>>> As far as I understand, the fail-safe N analysis serves to estimate the
>>> number of cases with zero effect size that would have to be added to
>>> turn a significant effect size just not significant anymore. Is there
>>> also an opposite test, i.e. how many cases with significant effect (for
>>> example the case with the most extreme effect size in the dataset) that
>>> would have to be added to turn a non-significant effect size into a
>>> significant one?
>>>
>>> Best wishes,
>>> Arne Janssen