[R-meta] Standard error of heterogeneity with rma.uni

Tue Mar 5 08:58:27 CET 2024

Sorry about "blindly copy-pasting" the code
"control=list(tau2.min=-min(vi))". I thought the vi was a variable in the
specified dataset. I didn't know it was an external data vector. I mean,
other parameters in rma.uni come from the data. I'm new to R and metafor,
remember, so I think in SAS terms.

I also didn't know how you had generated the SE for heterogeneity. I
implemented "scoring=100" in Proc Mixed, as per your suggestion, and yes, it
generates the same SE as yours. It does make the coverage of the 90%
confidence interval for the heterogeneity variance and for the fixed effects
a little suboptimal (~88%) under some small-sample circumstances, but a big
plus is that only a handful of 5000 simulations failed to converge, whereas
up to 500 of the 5000 simulations failed without scoring=100. So thank you,
the suboptimal coverage is worth it. Those other methods of estimation you
mentioned don't appear to be available in Proc Mixed, but I am happy with
the default REML. I am not skilled enough yet with R to run my simulations
therein (I have to learn "list" coding?), so the only way I can compare Proc
Mixed's confidence limits with those for the same dataset in metafor is to
specify type="PL" in the confint function. There was close agreement with
the CLs for the fixed effects, but big discrpancy for the upper CL for the
heterogeneity.

Using Proc Mixed, with 90% CLs:
CovParm	Estimate	StdErr	Lower	Upper
StudyID		0.5427		0.5316	-0.3317	1.4172

Using rma.unit:
tau^2 (estimated amount of residual heterogeneity): 0.5425 (SE = 0.5315)
Then confint(..., level=90,...):
              estimate ci.lb   ci.ub 
tau^2    0.5425    NA     2.3542 

I don't know what to make of that. Proc Mixed's upper CL comes from assuming
a normal distribution for the hetero variance. Remember, this way of doing
the CLs gives pretty darn good coverage. If your upper CL is consistency
greater than Proc Mixed's, the coverage will be way too high, won't it? I
guess I'll see what happens when I get 5000 simulations running in metafor.

Re: selmodel() does not work with negative variance for heterogeneity... So
the idea is not to allow negative point estimates of variance in rma.uni. A
substantial proportion of point estimates will want to be (and arguably
should be allowed to be) negative, when real heterogeneity is zero or
slightly positive, when sample sizes and numbers of studies are small, and
especially when there is substantial publication bias. But that's OK, you
have to set them to zero. Again, I'll have to see how well it works, when I
figure out how to do it.

Re: the discrepancy with the SE for the males... I couldn't reproduce that,
and I really did check it at the time. Let's say a cosmic ray flipped a bit!
I ran half a dozen more datasets in SAS and metafor, and the SEs of the
fixed effects all agreed.

Will

-----Original Message-----
From: Viechtbauer, Wolfgang (NP)
<wolfgang.viechtbauer using maastrichtuniversity.nl> 
Sent: Monday, March 4, 2024 11:07 PM
To: R Special Interest Group for Meta-Analysis
<r-sig-meta-analysis using r-project.org>
Cc: Will Hopkins <willthekiwi using gmail.com>
Subject: RE: [R-meta] Standard error of heterogeneity with rma.uni

Dear Will,

Please see my responses below.

Best,
Wolfgang

> -----Original Message-----
> From: R-sig-meta-analysis
<mailto:r-sig-meta-analysis-bounces using r-project.org> 
> On Behalf Of Will Hopkins via R-sig-meta-analysis
> Sent: Sunday, March 3, 2024 22:15
> To: 'R Special Interest Group for Meta-Analysis' 
> <mailto:r-sig-meta-analysis using r- project.org>
> Cc: Will Hopkins <mailto:willthekiwi using gmail.com>
> Subject: [R-meta] Standard error of heterogeneity with rma.uni
>
> After a year of denial, a colleague and I are starting to use metafor 
> instead of SAS's Proc Mixed, since finding recently that selection 
> models to adjust for publication bias are not implementable in SAS 
> (not by us, anyway, and no-one else seems to be using SAS for 
> meta-analyses). So I'm starting by importing data into R that I have 
> simulated and analyzed in SAS, to make sure I am using metafor 
> correctly. I have found that the standard error (SE) for the 
> heterogeneity variance provided by rma.uni is considerably smaller 
> than that provided by SAS, and the SE for a fixed effect can be 
> smaller in metafor, too. Here are two examples. Along the way, I have
several questions I hope you or someone can answer, Wolf.
>
> In the first example, there were 20 studies (10 female, 10 male), with 
> sample size ranging from 10 to 30, with small and moderate mean 
> changes for males and females respectively, small heterogeneity within 
> each group, and standard errors of measurement such that the mean 
> changes in most of the male studies were non-significant and most of the
females were significant.
> ("Small" is not trivial, "small" and "moderate" are not defined by 
> standardization. In the following examples, small and moderate mean 
> changes are 1 and 3, and small heterogeneity is and SD of 0.5.)
>
> The code for metafor was:
> rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=Female01, data=simraw).
> (Female01 is a dummy variable.) The solution for fixed effects was the 
> same in metafor...
>                 estimate    se
> intrcpt    0.9546       0.3819
> mods      2.1568       0.5162
> ...as in SAS...
> Effect         Estimate   StdErr
> Mean         0.9546      0.3819
> Female01  2.1568      0.5162
>
> But for the heterogeneity, metafor produced...
> tau^2 (estimated amount of residual heterogeneity):     0.2755 (SE =
0.3810)
> ...while SAS produced...
> CovParm     Estimate  StdErr
> StudyID              0.2755   0.5256
>
> As you can see, the point estimates are the same, but the SE in 
> metafor 0.72 that in SAS.  I presume SAS is correct, because I can use 
> the simulation in SAS to generate thousands of meta-analyses for given 
> population values of everything, and the coverage of the confidence 
> intervals for the estimates of heterogeneity is exactly the level of 
> confidence of the intervals (I use 90%, for various good reasons). Is 
> this a small-sample issue in the way you produce the SE in rma.uni, 
> Wolf, one that is not a problem in SAS?  Is there a solution in metafor?

SAS uses by default the inverse of the Hessian (i.e., the observed Fisher
information matrix) to calculate the SE of tau^2, while rma() uses the
expected Fisher information matrix. If you use 'scoring=100' (should be
sufficiently large), then SAS will also use the latter and the SEs are the
same. Or one can use:

simraw$id <- 1:nrow(simraw)
res2 <- rma.mv(Ydelta, YdeltaSEsq, random = ~ 1 | id, cvvc=TRUE)
round(sqrt(res2$vvc), 4)

to get the SE based on the observed Fisher information matrix.

Which is to be preferred (the observed or expected information) is an open
issue. Neither is particularly useful though in the present context, because
better CIs for tau^2 can be constructed (e.g., using the Q-profile method,
the generalized Q-statistic method, or the profile likelihood method) that
do not make use of any SE of tau^2.

> Will this matter, when I come to use selmodel?

No.

> And will selmodel work with negative values of heterogeneity variance?

No.

> In the second example, I used simulated data that produced negative 
> heterogeneity variance in SAS. I am "brave enough to step into risky 
> territory", to quote you from the documentation, Wolf, because 
> negative variance for point estimates and confidence limits are 
> necessary to get realistic unbiased estimates and correct coverage of 
> random effects, when the uncertainty in heterogeneity is large enough 
> relative to its true point estimate, so I wanted to make sure rma.uni 
> produced correct negative variance. When I asked this mailing list 
> about getting negative variance a year ago, James Pustejovsky provided 
> the code (which is also in the metafor
> documentation):
> rma(yi = yi, vi = vi, data=dat, control=list(tau2.min=-min(vi))).
>
> Unfortunately the above code doesn't quite work for me. Here's the 
> line of code for my data:
> rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=Female01, data=simrawnv,
> control=list(tau2.min=-min(vi)))
> I got this error:
> Error in min(vi) : invalid 'type' (closure) of argument.

Blindly copy-pasting code without the necessary adjustments to variable
names isn't going to work.

> And when I tried this...
> rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=Female01, data=simrawnv,
> control=list(tau2.min=-min(YdeltaSEsq)))
> ...I got this...
> Error: object 'YdeltaSEsq' not found.

You can use control=list(tau2.min=-min(simrawnv$YdeltaSEsq)). The tau2.min
argument does not make use of non-standard evaluation.

> But I got it to work with this...
> rma.uni(yi=Ydelta, vi=YdeltaSEsq, mods=Female01, data=simrawnv,
> control=list(tau2.min=-99))
> ...which gave this message...
> Warning message:
> Value of 'tau2.min' constrained to -min(vi) = -0.1640.
> So there's a bug, but it's easy to bypass it meantime.

This is not a bug. If we allow tau^2 to go below -min(vi), then the marginal
variance becomes negative. Neither the all holy SAS(r) nor metafor can
magically make this work.

> The point estimates of the fixed effects in metafor and SAS were 
> identical, but the SE for the intercept (males) in metafor...
>                  estimate      se
> intrcpt    1.3013         0.0013
> mods      1.9245         0.2063
> ...was 0.67 times than in SAS...
> Effect         Estimate  StdErr
> Mean        1.3013     0.001930
> Female01 1.9245     0.2063

This seems quite peculiar. Please provide a fully reproducible example
replicating this issue.

> The SE for heterogeneity in metafor...
> tau^2 (estimated amount of residual heterogeneity):     -0.1640 (SE =
> 0.0735)
> ... was less than half that in SAS...
> CovParm  Estimate  StdErr
> StudyID   -0.1640     0.1636
>
> Should I be disappointed that both SAS and metafor set the point 
> estimate of negative variance to minus the smallest variance of the study
estimates?

No, because it is the only sensible thing that can be done.

> It seems a pretty arbitrary and clunky thing to do, "to ensure that 
> the marginal variances are always non-negative", to quote Wolf in the 
> documentation.

That 'Wolf' guy seems to know what he is talking about here.

> Perhaps someone can explain that. It obviously works as far as 
> coverage is concerned, in SAS anyway.
>
> These problems presumably go away with large-enough numbers of studies 
> and/or sample sizes and/or study-estimate SEs relative to effect 
> magnitudes and/or heterogeneity, but mainly I am working with small 
> numbers of studies, small sample sizes, often large(-ish) 
> study-estimate SEs relative to effect magnitudes, and small heterogeneity.
What to do?  Bootstrap in metafor?

If you think this is going to fix whatever problem you perceive, then here
is some code to get you started:

https://www.metafor-project.org/doku.php/tips:bootstrapping_with_ma

> The main issue for me is whether I will be able to use selmodel to 
> adjust for publication bias, when the study estimates are such that 
> negative heterogeneity could arise purely from sampling uncertainty or 
> from publication bias (which results in underestimation of 
> heterogeneity, in our simulations), and will surely arise in 
> simulations aimed at estimating bias and coverage.

As mentioned above, selmodel() does not allow for negative tau^2 values.