[R-meta] sandwich estimator with geeglm object

Viechtbauer, Wolfgang (NP) wolfgang.viechtbauer using maastrichtuniversity.nl
Wed May 3 13:45:24 CEST 2023


You can also just use lme4::glmer() directly; except for model="CM.EL", that is what rma.glmm() uses in the background by default. Other package options are glmmTMB and GLMMadaptive. These packages can handle multilevel structures. On the other hand, if the same participants were measured on multiple dichotomous response variables (or at multiple time points), then things are trickier, since in principle one would need the IPD to model this correctly. If one just has the aggregated counts for each response variable, then RVE might work (either with a multilevel logistic model or GEE).
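
For illustration, a minimal sketch of such a mixed-effects logistic model fitted directly with lme4::glmer() (using a purely hypothetical long-format data frame 'dat' with one row per study arm and columns events, n, group, and study; this is not necessarily the exact parameterization that rma.glmm() uses internally) could look like this:

library(lme4)

# random intercepts and random treatment effects across studies
fit <- glmer(cbind(events, n - events) ~ group + (group | study),
             family = binomial, data = dat)
summary(fit) # fixed effect for 'group' = average (conditional) log odds ratio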

Best,
Wolfgang

>-----Original Message-----
>From: Yefeng Yang [mailto:yefeng.yang1 using unsw.edu.au]
>Sent: Wednesday, 03 May, 2023 13:17
>To: Viechtbauer, Wolfgang (NP); R Special Interest Group for Meta-Analysis
>Cc: James Pustejovsky
>Subject: Re: [R-meta] sandwich estimator with geeglm object
>
>Dear both,
>
>Comment 1 from Wolfgang is exactly what I wanted to ask about regarding the
>within-study correlation.
>
>The reason why I ask about GEE and the within-study correlation is that I did
>not find a solution for dependent 2×2 table data (or, more generally, non-normal
>outcomes with correlated errors). metafor::rma.glmm() works only for independent
>data. I am wondering whether we can use weighted GEE to resolve this issue. I
>acknowledge that the beta coefficients from GEE differ from those of a GLMM (the
>former is a marginal model giving population-average estimates and the latter is
>a conditional model giving subject-specific estimates). Using GEE + a small-sample
>adjustment or GLM + small-sample RVE seems like the perfect solution for this.
>Any comments on this?
>
>Best,
>Yefeng
>________________________________________
>From: Viechtbauer, Wolfgang (NP) <wolfgang.viechtbauer using maastrichtuniversity.nl>
>Sent: Wednesday, 3 May 2023 17:34
>To: R Special Interest Group for Meta-Analysis <r-sig-meta-analysis using r-
>project.org>; Yefeng Yang <yefeng.yang1 using unsw.edu.au>
>Cc: James Pustejovsky <jepusto using gmail.com>
>Subject: RE: [R-meta] sandwich estimator with geeglm object
>
>Just a few comments from my side:
>
>1) While rma.mv() is not designed to estimate a parameterized V matrix, one can
>easily do this manually. I think this is what Yefeng meant by an "iterative way to
>find the maximum likelihood". For example, if we assume a constant correlation
>between the sampling errors within studies, one can just repeatedly construct the
>corresponding V matrix (e.g., with vcalc()), fit the model with rma.mv(), extract
>the log likelihood (with logLik()), and find the value of the correlation that
>maximizes the log likelihood.
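>
>For instance, a minimal sketch of this profiling approach (using a purely
>hypothetical data frame 'dat' with an effect size column yi, sampling variances
>vi, an effect size identifier esid, and a study identifier study) could look
>something like this:
>
>library(metafor)
>
># profile the (restricted) log likelihood over the assumed within-study correlation
>prof_ll <- function(rho, data) {
>  V <- vcalc(vi, cluster = study, obs = esid, rho = rho, data = data)
>  res <- rma.mv(yi, V, random = ~ 1 | study/esid, data = data)
>  c(logLik(res))
>}
>
>opt <- optimize(prof_ll, interval = c(0, 0.99), data = dat, maximum = TRUE)
>opt$maximum # correlation value that maximizes the log likelihood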
>
>A special case of this is the so-called 'Riley model', in which the within- and
>between-study correlations are assumed to be identical. This can be fitted in the
>same manner, since rma.mv() also allows fixing variance/correlation components
>(so in the iterative steps above, the same correlation is used to construct V and
>is fixed via rma.mv(..., rho=<value>)).
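>
>A minimal sketch of this special case (again purely hypothetical: a data frame
>'dat' with yi, vi, study, and an 'outcome' indicator for two outcomes per study)
>could be:
>
>riley_ll <- function(rho, data) {
>  V <- vcalc(vi, cluster = study, obs = outcome, rho = rho, data = data)
>  res <- rma.mv(yi, V, mods = ~ outcome - 1,
>                random = ~ outcome | study, struct = "UN", rho = rho,
>                data = data)
>  c(logLik(res))
>}
>
>opt <- optimize(riley_ll, interval = c(-0.99, 0.99), data = dat, maximum = TRUE)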
>
>Indeed, I have played around with this for a couple of examples. When allowing
>the within- and between-study correlations to differ, one often runs into the
>issue that one of the two correlations drifts towards a bound (±1). With enough
>data, though, it is possible to fit such a model.
>
>2) As for using moment estimators versus (RE)ML: Likelihood-based methods are
>often quite robust to violations of the distributional assumptions, especially
>when it comes to estimating the fixed effects and their SEs. For constructing
>prediction intervals, distributional assumptions are more crucial, but this is a
>step after the model fitting and will affect both methods equally unless one uses
>non-parametric methods for constructing such intervals.
>
>Best,
>Wolfgang
>
>>-----Original Message-----
>>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On
>>Behalf Of James Pustejovsky via R-sig-meta-analysis
>>Sent: Tuesday, 02 May, 2023 21:25
>>To: Yefeng Yang
>>Cc: James Pustejovsky; R Special Interest Group for Meta-Analysis
>>Subject: Re: [R-meta] sandwich estimator with geeglm object
>>
>>Responses below.
>>
>>On Mon, May 1, 2023 at 10:27 PM Yefeng Yang <yefeng.yang1 using unsw.edu.au>
>>wrote:
>>
>>> BTW, what is your opinion on using this iterative way (moment estimator)
>>> to estimate within-group correlations (e.g., auto-correlation terms)? I
>>> mean, the likelihood approach won't work if the distribution is misspecified
>>> (because we do not know the within-group correlations and we need to
>>> estimate them in geepack::geeglm()). This is a sort of dilemma.
>>
>>I don't have any strong prior. I would guess that the performance of these
>>variance component estimators (or, more generally, dependence structure
>>estimators) could be rather specific to the context you're looking at. For
>>instance, it might depend on whether the data are balanced repeated measures or
>>imbalanced with unequal numbers of repeated measures, on the number of
>>independent clusters, etc. Hard to say. Doing some Monte Carlo simulations could
>>help sort out which method to use.
>>
>>> If we put this in the context of meta-analysis, metafor::rma.mv() does not
>>> allow for estimating within-group correlations from the data, although one
>>> can do so by using an iterative way to find the maximum likelihood. What do
>>> you think?
>>
>>If I follow what you mean, I think you're talking about trying to estimate
>>rho (the correlation between sampling errors) instead of just imputing a
>>value of rho. If I recall correctly, Wolfgang and I have both toyed with this
>>idea a bit. It's feasible to do this using nlme::lme(). The problem is that the
>>REML estimation method in nlme with a fixed variance structure at level 1 is
>>still a bit suspect (it might have a bug, since it doesn't agree with
>>metafor::rma.mv()).
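>>
>>For what it's worth, a minimal sketch of that approach (with a purely
>>hypothetical data frame 'dat' containing yi, vi, and study) might look like:
>>
>>library(nlme)
>>
>># fix the level-1 scale at 1 so that vi acts as the known sampling variance,
>># and let corCompSymm() estimate the correlation among the sampling errors
>>fit <- lme(yi ~ 1, random = ~ 1 | study, data = dat,
>>           weights = varFixed(~ vi),
>>           correlation = corCompSymm(form = ~ 1 | study),
>>           control = lmeControl(sigma = 1))
>>summary(fit)
>>
>>(With the caveat just mentioned that the REML results under the fixed level-1
>>scale may not agree with metafor::rma.mv().)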
>>
>>> Cheers,
>>> Yefeng
>>>
>>> ------------------------------
>>> *From:* James Pustejovsky <jepusto using gmail.com>
>>> *Sent:* Tuesday, 2 May 2023 12:39
>>> *To:* Yefeng Yang <yefeng.yang1 using unsw.edu.au>
>>> *Cc:* R Special Interest Group for Meta-Analysis <
>>> r-sig-meta-analysis using r-project.org>
>>> *Subject:* Re: [R-meta] sandwich estimator with geeglm object
>>>
>>> Hi Yefeng,
>>>
>>> Yes, this is correct if we're speaking on a very broad level. Broadly,
>>> geepack::geeglm() with an identity link function fits the same type of
>>> model as nlme::gls(), but geepack::geeglm() uses sandwich variance
>>> estimation whereas nlme::gls() uses model-based variance estimation.
>>> Running nlme::gls() with clubSandwich::vcovCR() will give you sandwich
>>> variance estimation. The CR1 and CR3 sandwich estimators can be calculated
>>> from both packages for comparison purposes.
>>>
>>> There is one other difference that should be noted though:
>>> geepack::geeglm() uses moment estimators for the parameters of the working
>>> model (such as variance coefficients, auto-correlation terms, etc.). In
>>> contrast, nlme::gls() uses maximum likelihood or restricted maximum
>>> likelihood estimation for these parameters. This difference in estimation
>>> strategies can lead to numerical differences in the model coefficient
>>> estimates calculated by the two packages (and therefore also differences
>>> in the sandwich standard errors generated by geepack::geeglm versus by
>>> clubSandwich::vcovCR.gls).
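>>>
>>> For example, a rough sketch of such a comparison (with a purely hypothetical
>>> repeated-measures data frame 'dat' containing y, x, and a cluster identifier
>>> id) might be:
>>>
>>> library(geepack)
>>> library(nlme)
>>> library(clubSandwich)
>>>
>>> # GEE with an exchangeable working correlation; reports sandwich SEs
>>> fit_gee <- geeglm(y ~ x, id = id, data = dat, family = gaussian,
>>>                   corstr = "exchangeable")
>>> summary(fit_gee)
>>>
>>> # GLS with a compound-symmetric correlation structure, plus sandwich SEs
>>> fit_gls <- gls(y ~ x, data = dat,
>>>                correlation = corCompSymm(form = ~ 1 | id))
>>> coef_test(fit_gls, vcov = "CR1", cluster = dat$id)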
>>>
>>> James
>>>
>>> On Mon, May 1, 2023 at 7:19 PM Yefeng Yang <yefeng.yang1 using unsw.edu.au>
>>> wrote:
>>>
>>> Dear James,
>>> Probably it is clearer if I ask "whether glm + RVE is equal to gee?".
>>> gee does RVE on the fly, while glm + RVE uses a post-hoc approach to
>>> calculate robust standard errors when dealing with correlated errors (e.g.,
>>> repeated measurements).
>>> Best,
>>> Yefeng

