[R-meta] Overdispersion in data analyized by metafor. Does one need to correct, if so how can is this accomplished?

Viechtbauer Wolfgang (SP) wolfgang.viechtbauer at maastrichtuniversity.nl
Tue Oct 31 12:54:36 CET 2017

Dear John,

It seems that these are 'paired' data, that is, the first row for type 'general' goes together with the first row for type 'regional' and so on (otherwise the restructuring into the wide format would not make sense). So the first comment I would have is that the data should be analyzed as such. The Poisson, QuasiPoisson, and negative binomial models do not account for this pairing/clustering. A multilevel model should be used instead.

Leaving this issue aside for now, if you want to compare models, you would want to compare the Poisson model with:

dat <- escalc(measure="IRLN", xi=events, ti=SS, data=jdata)
rma(yi, vi, mods = ~ type, data=dat, method="FE")

That is, the Poisson model is conceptually the same as analyzing log transformed incidence rates, with 'type' as a moderator, and using a 'fixed-effects model'. As the Poisson model, this yields a significant 'type' effect.

There appears to be a lot of residual heterogeneity in these data and the 'meta-analytic way' to account for this overdispersion is to add estimate/observation-level random effects to the model. That is, we fit a random/mixed-effects model with:

rma(yi, vi, mods = ~ type, data=dat)

As the QuasiPoisson and negative binomial models, this yields a non-significant 'type' effect. So, to answer your question: In metafor -- or more generally, in meta-analysis -- overdispersion can be accounted for by using random/mixed-effects models.

But in that sense, neither the QuasiPoisson nor the negative binomial model is conceptually 100% analogous to what happens in the meta-analytic model. The most direct comparison would be using a Poisson mixed-effects model, again with observation level random effects:

jdata$id <- 1:nrow(jdata)
res <- glmer(events ~ type + (1 | id), family=poisson(link=log), offset=log(SS), data=jdata)

You will find the results from this model to match very closely with those from the meta-analytic model.

To come back though to the earlier point: I think one would want to use a model that accounts for the pairing/clustering. For example:

jdata$pair <- 1:16
res <- glmer(events ~ type + (1 | pair/id), family=poisson(link=log), offset=log(SS), data=jdata)

The analogous meta-analysis model would be:

dat <- escalc(measure="IRLN", xi=events, ti=SS, data=jdata)
rma.mv(yi, vi, mods = ~ type, random = ~ 1 | pair/id, data=dat)

Again, very similar results.

Alternatively, you could use the wide-format dataset and then analyze log incidence rate ratios:

dat <- escalc(measure="IRR", x1i=regionalevents, t1i=regionaln, x2i=generalevents, t2i=generaln, data=datawide)
rma(yi, vi, data=dat)

Pretty much the same results (the coefficient is the wrong sign, since in your code for restructuring, you mixed up the 'regional' and 'general' columns).

Interestingly, there is now a significant 'type' effect BUT IN THE OPPOSITE DIRECTION. Well, hello there, Simpson's paradox! So, the large majority of the heterogeneity/overdispersion is between pairs and relatively little within pairs. Therefore, when comparing 'alike with alike' (i.e., within pairs), we do detect the significant effect of 'type' and in the appropriate direction.

For the QuasiPoisson or negative binomial models, you could add 'pair' as a factor:

model2=glm(events~type + factor(pair),family=quasipoisson(link=log),offset=log(SS),data=jdata)
model3 <- glm.nb(events~type+factor(pair)+offset(joffset),link=log,data=jdata,weights=jdata[,"weight"])

That leads to the same conclusion.


Wolfgang Viechtbauer, Ph.D., Statistician | Department of Psychiatry and 
Neuropsychology | Maastricht University | P.O. Box 616 (VIJV1) | 6200 MD 
Maastricht, The Netherlands | +31 (43) 388-4170 | http://www.wvbauer.com 

-----Original Message-----
From: Sorkin, John [mailto:jsorkin at som.umaryland.edu] 
Sent: Tuesday, 31 October, 2017 0:49
To: Wolfgang Viechtbauer
Cc: r-sig-meta-analysis at r-project.org
Subject: Overdispersion in data analyized by metafor. Does one need to correct, if so how can is this accomplished?



I thank  Wolfgang Viechtbauer for the work he has done, and the time he has devoted to the metafor package.

I hope you will allow me to ask what may seem like two simple questions. My questions, more completely described below are (1) Is it necessary to account for over-dispersion when using  metafor? (2) If it is necessary, how does one do so. My data, and R code are attached to this email as a text document (as a .R file).

I have data describing two groups, one labeled "regional" and a second "general". Before I discovered the metafor package, I used glm with Poisson regression to determine if the event rates differed between my two groups. Poisson regression give a significant  difference, but the ratio of the residual variance to the degrees of freedom is far from one. I re-ran the glm using quaisPoisson [family=quasipoisson(link=log)] and got a no significant difference between groups. I then ran a negative binomial analysis and to make sure the finding of excessive deviance was correct (deviance 124, DF=30, ratio apprxomatly 4).

After I discovered metafor, I re-ran the analyses and get a significant group effect. I think the over-dispersion I found in the Poission and negative binomial regressions may be causing an estimation problem for metafor. If over-dispersion is a problem, I don't know how I can have metafor account for the over-dispersion (as I did with the glm by using quasipoisson and negative binomial rather than poisson regression).

I thank you for your time and help.

Thank you,

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

More information about the R-sig-meta-analysis mailing list