[R-meta] What to do if the residuals are not normally distributed?

Wed Nov 8 15:37:29 CET 2017

Thanks to both Wolfgang and Michael for their replies

I am immensely impressed with the work Wolfgang is doing in sharing his knowledge in this forum!

Now I have drawn a qqnorm-plot of my first model using the following code:

meta_list <- rma.mv(yi=meta_filter_1$yi,
                    V=VarC, 
                    random = ~1|ExpName/ControlName/VarID,
                    test="t")

meta_resid <- rstandard.rma.mv(meta_list)
z_val <- meta_resid$z
qqnorm(z_val)
qqline(z_val)

Now comes the big problem: the data do not seem normal at all. They diverge counterclockwise in both ends (indicating longer tails than the normal distribution).

So what to do now? Remove outliers? Is it a serious issue that the residuals are not normally distributed or do I not have to worry?

Is there something I am missing?

Best wishes and thanks again

Jonas

-----Oprindelig meddelelse-----
Fra: Viechtbauer Wolfgang (SP) [mailto:wolfgang.viechtbauer at maastrichtuniversity.nl] 
Sendt: 08 November 2017 12:26
Til: Jonas Duus Stevens Lekfeldt; r-sig-meta-analysis at r-project.org
Emne: RE: [R-meta] Calculating the var-covar matrix for dependent effect sizes for ROM

As far as I can tell, your calculation of the V matrix is correct. If you need a reference for the equation for the covariance, see:

Lajeunesse, M. J. (2011). On the meta-analysis of response ratios for studies with correlated and multi-group designs. Ecology, 92(11), 2049-2055.

One thing to be careful about: Make sure that your dataset is first ordered by the variable that you are splitting on with split(). Otherwise, your V matrix is not ordered in the correct way. To illustrate:

dat <- data.frame(study=c(3,3,2,1,1), yi=c(.1,.2,.4,.3,.5), vi=c(.02,.01,.07,.05,.03)) dat sav <- lapply(split(dat$vi, dat$study), function(x) diag(x, nrow=length(x))) sav
bldiag(sav)

So do:

dat <- dat[order(dat$study),]
dat
sav <- lapply(split(dat$vi, dat$study), function(x) diag(x, nrow=length(x))) sav
bldiag(sav)

Let's just say that this has bitten me in the behind once or twice before.

Also, since you are splitting on "ControlName", I assume that values of "ControlName" are never the same across different values of "ExpName" (otherwise, you would be computing covariances across different experiments).

Your rma.mv() code looks fine.

I haven't gotten around to implementing qqnorm() for 'rma.mv' objects. But since rstandard() (and rstudent() if you install the devel version of metafor) work for 'rma.mv' objects, you can just feed the standardized residuals to qqnorm() directly.

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Jonas Duus Stevens Lekfeldt
Sent: Wednesday, 08 November, 2017 9:19
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] Calculating the var-covar matrix for dependent effect sizes for ROM

Thank you, Wolfgang for the fast and clear reply!

In an earlier reply by Wolfgang I found the following formula for calculating the covariance for ROM (log response ratio) when controls are shared among some of the datasets (if I have understood it correctly):

Covariance = sd^2/(n*mean^2), from the group whose data is being re-used.

I have calculated a new covariance column in the dataset data of the individual effect sizes based on the control data in the following way:

data <- data %>%
  dplyr::mutate (covar=((data$sd2i)^2)/(data$n2i*(data$m2i^2))),

where:
sd2i is the standard deviation of the control group n2i is the sample size of the control group m2i is the mean of the control group

Subsequently I have calculated the variance-covariance matrix (here called VarC) using the following code (again inspired by Wolfgang):

calc.v.ROM <- function(x) {
  v <- matrix(x$covar[1],nrow=nrow(x),ncol=nrow(x))
  diag(v) <- x$vi
  v
}

covar_list <- lapply(split(data,data$ControlName),calc.v.ROM)
VarC <- bldiag(covar_list)

Where "ControlName" is the column in "data" where the names of the control groups are stored.

Using VarC as the argument to V in the following code gives meaningful results so it seems to work, but I would like to ask if it seems correct?

meta_list <- rma.mv(yi=data$yi,
                    V=VarC, 
                    random = ~1|ExpName/ControlName/ID,
                    test="t")

Another question: drawing a qqnorm-plot does not seem to be implemented for rma.mv(). Is that right? 

Best regards

Jonas Duus Stevens Lekfeldt