[R-meta] What to do if the residuals are not normally distributed?

Viechtbauer Wolfgang (SP) wolfgang.viechtbauer at maastrichtuniversity.nl
Fri Nov 10 10:39:56 CET 2017


Thanks, the plot came through. Indeed, this looks like a fat-tailed distribution. 

So, you could try:

robust(model, cluster=dat$ExpName)

or:

library(clubSandwich)
coef_test(model, vcov="CR2")

as one potential approach to deal with this. One could also try bootstrapping (http://www.metafor-project.org/doku.php/tips:bootstrapping_with_ma), although for multilevel models, this becomes increasingly more complex to do properly.

If the non-normality comes from the random effects, there are models that allow for non-normal distributions for the random effects; for example:

Baker, R., & Jackson, D. (2008). A new approach to outliers in meta-analysis. Health Care Management Science, 11(2), 121-131.

Beath, K. J. (2016). metaplus: An R package for the analysis of robust meta-analysis and meta-regression. The R Journal, 8(1), 5-16.

Branscum, A. J., & Hanson, T. E. (2008). Bayesian nonparametric meta-analysis using Polya tree mixture models. Biometrics, 64(3), 825-833.

Burr, D., & Doss, H. (2005). A Bayesian semiparametric model for random-effects meta-analysis. Journal of the American Statistical Association, 100(469), 242-251.

Lee, K. J., & Thompson, S. G. (2008). Flexible parametric models for random-effects distributions. Statistics in Medicine, 27(3), 418-434.

With 'metaplus', there is an even an R package for this, but it does not cover multilevel models like you are fitting. For the other approaches, I am not aware of any out-of-the-box implementations. Using BUGS/JAGS/Stan, one could implement things like t-distributed random effects quite easily. No idea whether this would much of a difference though.

Best,
Wolfgang

>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-
>project.org] On Behalf Of Jonas Duus Stevens Lekfeldt
>Sent: Friday, 10 November, 2017 9:09
>To: r-sig-meta-analysis at r-project.org
>Subject: Re: [R-meta] What to do if the residuals are not normally
>distributed?
>
>ATTACHMENT(S) REMOVED: QQnorm_no moderators.pdf
>
>Dear all
>
>Thanks again to Wolfgang for his reply.
>
>I have tried to attach the plot as a PDF. The model is based on 945 data
>pairs and the output from rma.mv() looks like this:
>
>Multivariate Meta-Analysis Model (k = 945; method: REML)
>
>Variance Components:
>
>            estim    sqrt  nlvls  fixed                     factor
>sigma^2.1  0.0316  0.1779    114     no                    ExpName
>sigma^2.2  0.0074  0.0858    290     no        ExpName/ControlName
>sigma^2.3  0.0101  0.1006    945     no  ExpName/ControlName/VarID
>
>Test for Heterogeneity:
>Q(df = 944) = 14276.4311, p-val < .0001
>
>Model Results:
>
>estimate      se    tval    pval   ci.lb   ci.ub
>  0.0893  0.0192  4.6489  <.0001  0.0516  0.1270  ***
>
>---
>Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
>Best regards
>
>Jonas
>
>-----Oprindelig meddelelse-----
>Fra: Viechtbauer Wolfgang (SP)
>[mailto:wolfgang.viechtbauer at maastrichtuniversity.nl]
>Sendt: 08 November 2017 17:19
>Til: Jonas Duus Stevens Lekfeldt; r-sig-meta-analysis at r-project.org
>Emne: RE: What to do if the residuals are not normally distributed?
>
>Can you show us the plot? You can try attaching it, but the R project
>mailing lists are rather picky as to what attachments are allowed (for
>good reasons!). A png/pdf file should usually go through though.
>Otherwise post it somewhere online where we can look at it.
>
>Regardless, assessing (non)normality in more complex mixed-effects models
>is a tricky business. See, for example:
>
>Lange, N., & Ryan, L. (1989). Assessing normality in random effects
>models. Annals of Statistics, 17(2), 624-642.
>
>Nobre, J. S., & Singer, J. M. (2007). Residual analysis for linear mixed
>models. Biometrical Journal, 49(6), 863-875.
>
>In your case, there are the sampling distributions of the estimates
>themselves which may be non-normal (esp. in small samples), but also
>three sets of random effects, and each one of those sets may be non-
>normal. rstandard() gives you 'marginal' residuals, so it is not clear
>where the non-normality is coming from.
>
>Usually though, non-normality is not a major concern when it comes to
>your fixed effects (although the standard errors of the fixed effects may
>be adversely affected). These studies (focused on more simple RE models)
>suggest that non-normality is not a major issue:
>
>Kontopantelis, E., & Reeves, D. (2012). Performance of statistical
>methods for meta-analysis when true study effects are non-normally
>distributed: A simulation study. Statistical Methods in Medical Research,
>21(4), 409-426.
>
>van den Noortgate, W., & Onghena, P. (2003). Multilevel meta-analysis: A
>comparison with traditional meta-analytic procedures. Educational and
>Psychological Measurement, 63(5), 765-790.
>
>More generally, there are also theoretical results that indicate that
>non-normality in mixed-effects models is not a major concern, although we
>may have to adjust how we compute the standard errors of the fixed
>effects:
>
>Verbeke, G., & Lesaffre, E. (1997). The effect of misspecifying the
>random-effects distribution in linear mixed models for longitudinal data.
>Computational Statistics & Data Analysis, 23, 541-556.
>
>This is also backed up by this article:
>
>Maas, C. J. M., & Hox, J. J. (2004). Robustness issues in multilevel
>regression analysis. Statistica Neerlandica, 58(2), 127-137.
>
>Using robust standard errors may help to deal with non-normality.
>
>Best,
>Wolfgang
>
>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-
>project.org] On Behalf Of Jonas Duus Stevens Lekfeldt
>Sent: Wednesday, 08 November, 2017 15:37
>To: r-sig-meta-analysis at r-project.org
>Subject: [R-meta] What to do if the residuals are not normally
>distributed?
>
>Thanks to both Wolfgang and Michael for their replies
>
>I am immensely impressed with the work Wolfgang is doing in sharing his
>knowledge in this forum!
>
>Now I have drawn a qqnorm-plot of my first model using the following
>code:
>
>meta_list <- rma.mv(yi=meta_filter_1$yi,
>                    V=VarC,
>                    random = ~1|ExpName/ControlName/VarID,
>                    test="t")
>
>meta_resid <- rstandard.rma.mv(meta_list) z_val <- meta_resid$z
>qqnorm(z_val)
>qqline(z_val)
>
>Now comes the big problem: the data do not seem normal at all. They
>diverge counterclockwise in both ends (indicating longer tails than the
>normal distribution).
>
>So what to do now? Remove outliers? Is it a serious issue that the
>residuals are not normally distributed or do I not have to worry?
>
>Is there something I am missing?
>
>Best wishes and thanks again
>
>Jonas
>
>-----Oprindelig meddelelse-----
>Fra: Viechtbauer Wolfgang (SP)
>[mailto:wolfgang.viechtbauer at maastrichtuniversity.nl]
>Sendt: 08 November 2017 12:26
>Til: Jonas Duus Stevens Lekfeldt; r-sig-meta-analysis at r-project.org
>Emne: RE: [R-meta] Calculating the var-covar matrix for dependent effect
>sizes for ROM
>
>As far as I can tell, your calculation of the V matrix is correct. If you
>need a reference for the equation for the covariance, see:
>
>Lajeunesse, M. J. (2011). On the meta-analysis of response ratios for
>studies with correlated and multi-group designs. Ecology, 92(11), 2049-
>2055.
>
>One thing to be careful about: Make sure that your dataset is first
>ordered by the variable that you are splitting on with split().
>Otherwise, your V matrix is not ordered in the correct way. To
>illustrate:
>
>dat <- data.frame(study=c(3,3,2,1,1), yi=c(.1,.2,.4,.3,.5),
>vi=c(.02,.01,.07,.05,.03)) dat sav <- lapply(split(dat$vi, dat$study),
>function(x) diag(x, nrow=length(x))) sav
>bldiag(sav)
>
>So do:
>
>dat <- dat[order(dat$study),]
>dat
>sav <- lapply(split(dat$vi, dat$study), function(x) diag(x,
>nrow=length(x))) sav
>bldiag(sav)
>
>Let's just say that this has bitten me in the behind once or twice
>before.
>
>Also, since you are splitting on "ControlName", I assume that values of
>"ControlName" are never the same across different values of "ExpName"
>(otherwise, you would be computing covariances across different
>experiments).
>
>Your rma.mv() code looks fine.
>
>I haven't gotten around to implementing qqnorm() for 'rma.mv' objects.
>But since rstandard() (and rstudent() if you install the devel version of
>metafor) work for 'rma.mv' objects, you can just feed the standardized
>residuals to qqnorm() directly.
>
>Best,
>Wolfgang
>
>-----Original Message-----
>From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-
>project.org] On Behalf Of Jonas Duus Stevens Lekfeldt
>Sent: Wednesday, 08 November, 2017 9:19
>To: r-sig-meta-analysis at r-project.org
>Subject: [R-meta] Calculating the var-covar matrix for dependent effect
>sizes for ROM
>
>Thank you, Wolfgang for the fast and clear reply!
>
>In an earlier reply by Wolfgang I found the following formula for
>calculating the covariance for ROM (log response ratio) when controls are
>shared among some of the datasets (if I have understood it correctly):
>
>Covariance = sd^2/(n*mean^2), from the group whose data is being re-used.
>
>I have calculated a new covariance column in the dataset data of the
>individual effect sizes based on the control data in the following way:
>
>data <- data %>%
>  dplyr::mutate (covar=((data$sd2i)^2)/(data$n2i*(data$m2i^2))),
>
>where:
>sd2i is the standard deviation of the control group n2i is the sample
>size of the control group m2i is the mean of the control group
>
>Subsequently I have calculated the variance-covariance matrix (here
>called VarC) using the following code (again inspired by Wolfgang):
>
>calc.v.ROM <- function(x) {
>  v <- matrix(x$covar[1],nrow=nrow(x),ncol=nrow(x))
>  diag(v) <- x$vi
>  v
>}
>
>covar_list <- lapply(split(data,data$ControlName),calc.v.ROM)
>VarC <- bldiag(covar_list)
>
>Where "ControlName" is the column in "data" where the names of the
>control groups are stored.
>
>Using VarC as the argument to V in the following code gives meaningful
>results so it seems to work, but I would like to ask if it seems correct?
>
>meta_list <- rma.mv(yi=data$yi,
>                    V=VarC,
>                    random = ~1|ExpName/ControlName/ID,
>                    test="t")
>
>Another question: drawing a qqnorm-plot does not seem to be implemented
>for rma.mv(). Is that right?
>
>Best regards
>
>Jonas Duus Stevens Lekfeldt



More information about the R-sig-meta-analysis mailing list