[R-meta] order of effect sizes in data file changes results

Sun Dec 30 16:27:25 CET 2018

Dear Anna,

I suspect that the 'V' matrix (in your case, berkeyV) is not aligned with the data (yi). For example:

library(metafor)
dat <- dat.berkey1998
dat

V <- bldiag(lapply(split(dat[,c("v1i", "v2i")], dat$trial), as.matrix))
V

# one can see here that the 2x2 blocks along the diagonal of V are in fact the variances and covariances as given by the variables 'v1i' and 'v2i' in 'dat'

rma.mv(yi, V, mods = ~ outcome - 1, random = ~ outcome | trial, struct="UN", data=dat)

# but now let's change the order of the data

myorder <- order(dat$author)
dat <- dat[myorder,]
dat

# since the V matrix is unchanged, it is now not in the correct order anymore, so the following results are nonsense

rma.mv(yi, V, mods = ~ outcome - 1, random = ~ outcome | trial, struct="UN", data=dat)

# so we have to order the V matrix in the same way as the data

V <- V[myorder, myorder]

rma.mv(yi, V, mods = ~ outcome - 1, random = ~ outcome | trial, struct="UN", data=dat)

# same results as in the beginning
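
A quick way to catch this kind of misalignment before fitting the model is to compare the diagonal of V with the sampling variances stored in the data. The following is only a sketch: it assumes the 'vi' column of dat.berkey1998 holds the same variances that sit on the diagonal of V, and the object names 'misaligned' and 'realigned' are made up for illustration.

```r
library(metafor)

dat <- dat.berkey1998
V <- bldiag(lapply(split(dat[,c("v1i", "v2i")], dat$trial), as.matrix))

# reorder the data but not V, as in the example above
myorder <- order(dat$author)
dat <- dat[myorder,]

# the diagonal of V no longer matches the variances in the data
misaligned <- isTRUE(all.equal(unname(diag(V)), dat$vi))
misaligned

# reorder V in the same way and check again
V <- V[myorder, myorder]
realigned <- isTRUE(all.equal(unname(diag(V)), dat$vi))
realigned
```

A check like this only looks at the variances, so it will not detect a problem if two rows happen to have identical variances, but it is cheap and catches most reordering mistakes.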

Also, one has to be careful when using things like split(). It will order the splits by the splitting variable:

dat
split(dat, dat$trial)

Note that the rows of 'dat' and the splits are not in the same order. This can also lead to a misalignment between the data and V. So, to be safe, first order the data by the variable that will be used in split() (as in the very beginning, where the data are already ordered by 'trial').
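
The same behavior can be seen with a small toy example in base R. The data frame 'toy' and the objects 'before' and 'after' are hypothetical and only serve to show how split() orders its output:

```r
# hypothetical toy data: three studies, deliberately not sorted by 'trial'
toy <- data.frame(trial   = c(3, 3, 1, 1, 2, 2),
                  outcome = rep(c("A", "B"), times = 3))

# order in which the studies appear in the data versus order of the splits
unique(toy$trial)
names(split(toy, toy$trial))
before <- identical(as.character(unique(toy$trial)), names(split(toy, toy$trial)))

# after sorting by the splitting variable, the two orders agree
toy <- toy[order(toy$trial),]
after <- identical(as.character(unique(toy$trial)), names(split(toy, toy$trial)))

c(before = before, after = after)
```

Since order() keeps tied rows in their original sequence, the rows within each study stay in the order they had before sorting.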

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Van Meter, Anna
Sent: Sunday, 30 December, 2018 4:43
To: r-sig-meta-analysis using r-project.org
Subject: [R-meta] order of effect sizes in data file changes results

Hello,

As described in a previous post, I am conducting a meta-analysis of bipolar disorder prevalence rates. Some studies report multiple prevalence rates; for example, one study could report the prevalence for bipolar I and for the full bipolar spectrum, which would include people with bipolar I plus people who have other subtypes of bipolar disorder. There are three potential prevalence categories: bipolar I, bipolar I & II, all bipolar.

I am using the Berkey approach to account for the overlap in effect sizes and, based on a helpful response, have set the model up as follows to estimate the average prevalence for each subtype (threegroup is a dummycode for the subtype, articleno is the study ID):

resmvberkeyhybrid<-rma.mv(yi, berkeyV, mods = ~ threegroup, random = ~ threegroup | articleno, struct="UN", method="ML",data=kidtall1, digits=4)

My question:

I have noticed that the results of the model change depending on how the .csv data file is ordered. Why would the order of the data file matter? And what is the correct way to order the data file? My guess would be first by articleno and then by threegroup.

Thank you for your help!

Best,
Anna