[R] metafor package: study level variation

Mon Sep 10 10:52:22 CEST 2012

As usual, Michael was faster than I in responding. Let me add a few thoughts of my own. See comments below in text.

Best,
Wolfgang

--   
Wolfgang Viechtbauer, Ph.D., Statistician   
Department of Psychiatry and Psychology   
School for Mental Health and Neuroscience   
Faculty of Health, Medicine, and Life Sciences   
Maastricht University, P.O. Box 616 (VIJV1)   
6200 MD Maastricht, The Netherlands   
+31 (43) 388-4170 | http://www.wvbauer.com   

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Jarrett Byrnes
> Sent: Friday, September 07, 2012 16:02
> To: R help
> Subject: [R] metafor package: study level variation
> 
> Hello.  A quick question about incorporating variation due to study in the
> metafor package.  I'm working with a particular data set for meta-analysis
> where some studies have multiple measurements.  Others do not.  So, let's
> say the effect I'm looking at is response to two different kinds of drug
> treatment - let's call their effect sizes T1 and T2.  Some studies have
> multiple experiments measuring  T1 and T2.  Some have one of each.  Some
> only have T1 or T2.

I assume there is also a control group/condition in each of these studies, so in other words, you have a bunch of studies where some are two-arm studies comparing Trt1 *or* Trt2 to control and some are three-arm studies comparing both Trt1 *and* Trt2 to control.

> Now, in metafor, I've been using
> 
> rma(yi = logRatio, vi=varLogRatio, mods=~ Drug.Type, data=mydata)

Drug.Type is a dummy variable (either Trt1 or Trt2), so the code above will fit the model:

yij = beta0 + beta1 Trt2 + uij + eij,

where yij is the jth observed outcome in the ith study, beta0 then corresponds to the (average) outcome for Trt1, beta1 indicates how much higher or lower the (average) outcome is for Trt2 compared to Trt1, uij ~ N(0, tau^2), and eij ~ N(0, varLogRatio). This model will treat three-arm studies as if they were two (independent) two-arm studies. Probably not ideal.

> This works fine.  Out of curiosity, I ran a quickie model in lme4
> 
> lmer(logRatio ~ Drug.Type + (1+studyID), data=mydata, weights=varLogRatio)
> 
> and I noticed that the results are quite different, and this appears due
> to some variation due to study (after inspecting ranef - note, I included
> Drug.Type as a fixed effect as there were only two levels).

1) Did you use (1+studyID) or (1 | studyID)? The latter is probably what you meant/want to use.
2) You need to specify the *inverse* of the variances as weights.
3) This model assumes that the sampling variances are known up to a proportionality constant, not exactly known. You will therefore get what is sometimes called a multiplicative model for heterogeneity, with heterogeneity reflected in a residual variance estimate larger than 1. This model is different from the additive model (which is typically used), where the sampling variances are assumed to be known exactly and we *add* an additional random effect to reflect heterogeneity.
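
To make points 2) and 3) concrete, here is a minimal sketch in base R with made-up numbers (the yi and vi values below are purely illustrative, not from your data). With inverse-variance weights and no random effect, the weighted regression reproduces the usual inverse-variance weighted average, and the estimated residual variance is the multiplicative heterogeneity factor, which equals the Q statistic divided by its degrees of freedom:

```r
# Made-up effect sizes and sampling variances (illustration only)
yi <- c(0.10, 0.35, -0.05, 0.60, 0.25)
vi <- c(0.02, 0.05, 0.01, 0.08, 0.03)

# Weighted regression: the weights must be the *inverse* of the variances
fit <- lm(yi ~ 1, weights = 1 / vi)

# The intercept is the inverse-variance weighted average
mu_hat <- unname(coef(fit))

# Q statistic for heterogeneity
Q <- sum((yi - mu_hat)^2 / vi)

# Estimated residual variance = multiplicative heterogeneity factor = Q / (k - 1)
sigma2_hat <- summary(fit)$sigma^2

print(c(mu_hat = mu_hat, Q = Q, sigma2_hat = sigma2_hat))
```

A value of sigma2_hat larger than 1 indicates more variability than the sampling variances alone would imply.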

So, with (1 | studyID) and inverse sampling variance weights, you get the model:

yij = beta0 + beta1 Trt2 + ui + eij,

where ui ~ N(0, tau^2), eij ~ N(0, sigma^2 * varLogRatio). Now tau^2 reflects study-level variability and sigma^2 reflects multiplicative heterogeneity.
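
As a rough sketch, the corrected lmer() call would look as follows. Since I do not have your data, the data frame below is simulated (the number of studies, the variances, and the true coefficients are all assumptions chosen for illustration); only the variable names are taken from your message. The code is skipped if lme4 is not installed.

```r
set.seed(42)

# Simulated stand-in for 'mydata' (all values are assumptions)
k <- 30
studyID     <- factor(rep(seq_len(k), each = 2))
Drug.Type   <- factor(rep(c("Trt1", "Trt2"), times = k))
varLogRatio <- runif(2 * k, 0.02, 0.20)
u           <- rnorm(k, 0, 0.3)[as.integer(studyID)]   # study-level effects
logRatio    <- 0.2 + 0.3 * (Drug.Type == "Trt2") + u +
               rnorm(2 * k, 0, sqrt(varLogRatio))
mydata <- data.frame(studyID, Drug.Type, logRatio, varLogRatio)

if (requireNamespace("lme4", quietly = TRUE)) {
  # (1 | studyID): random intercept per study
  # weights = 1 / varLogRatio: inverse sampling variances
  fit <- lme4::lmer(logRatio ~ Drug.Type + (1 | studyID),
                    data = mydata, weights = 1 / varLogRatio)
  print(summary(fit))
}
```

In the summary, the studyID variance component corresponds to tau^2 and the residual variance to sigma^2 in the model above.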

> So, I went back to metafor and ran
> 
> rma(yi = logRatio, vi=varLogRatio, mods=~ Drug.Type+studyID, data=mydata)
> 
> which yielded the error
> 
> Error in qr.solve(wX, diag(k)) : singular matrix 'a' in solve
> In addition: Warning message:
> In rma(yi = logRatio, vi = varLogRatio, data = mydata, mods = ~Drug.Type
> :
>   Cases with NAs omitted from model fitting.
> 
> which appears to be due to the unbalanced nature of the dataset (some
> studies having T1 and T2, some having multiple measures of T1 and T2).

I would have to see:

with(mydata, model.matrix(~ Drug.Type + studyID))

to figure out what is going on here. The error indicates that you have some linear dependency between the columns of the design matrix. That should not happen based on what you describe. For example, suppose there are 4 studies, where the first and fourth are three-arm studies, the second only examines Trt1, and the third only Trt2. Then:

> study <- factor(c(1,1,2,3,4,4))
> trt <- factor(c(1,2,1,2,1,2))
> model.matrix(~ trt + study)
  (Intercept) trt2 study2 study3 study4
1           1    0      0      0      0
2           1    1      0      0      0
3           1    0      1      0      0
4           1    1      0      1      0
5           1    0      0      0      1
6           1    1      0      0      1

which is of full rank. That should be true regardless of how many studies (of each type) I add.
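
A quick way to check this on your own data is to compare the rank of the design matrix with its number of columns. The sketch below does this in base R for the toy example above, and then for a second, hypothetical configuration (my own assumption, not something you described) in which no study contributes both treatments; in that case the treatment dummy is a sum of study dummies and the matrix is rank deficient:

```r
# Toy example from above: studies 1 and 4 examine both treatments
study <- factor(c(1, 1, 2, 3, 4, 4))
trt   <- factor(c(1, 2, 1, 2, 1, 2))
X <- model.matrix(~ trt + study)
full_rank <- qr(X)$rank == ncol(X)
print(full_rank)

# Hypothetical case: every study examines only one treatment
# (two measurements per study), so trt2 = study2 + study3
study_b <- factor(c(1, 1, 2, 2, 3, 3))
trt_b   <- factor(c(1, 1, 2, 2, 2, 2))
X_b <- model.matrix(~ trt_b + study_b)
full_rank_b <- qr(X_b)$rank == ncol(X_b)
print(full_rank_b)
```

The same check applied to model.matrix(~ Drug.Type + studyID) on your data (after removing the rows with NAs) would show whether the design matrix is the source of the error.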

> So, is there a way to properly incorporate studyID in a metafor using rma?
> Is there an argument I'm missing, or perhaps should be using a different
> function?

At the moment, metafor is really only set up for univariate models (that should change in the near? future). The kind of multilevel/multivariate structure you are dealing with will require (at the moment) other tools.

Note that there is an additional issue with your data: If logRatio reflects the difference between Trt1 or Trt2 and Control, then in three-arm studies the two logRatio values are dependent, since the data from the control group/condition are used twice. Note that this is statistical dependence over and above what is induced by potentially correlated true effects within the three-arm studies. See chapter 19 in the Handbook of Research Synthesis and Meta-Analysis (2nd ed.).
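
The dependence induced by the shared control group can be illustrated with a small simulation in base R. This is only a sketch under assumptions of my own: I take logRatio to be the log of a ratio of group means, and the group size, mean, and standard deviation are arbitrary. The true effects are identical in every simulated study, so any correlation between the two log ratios comes solely from reusing the control group:

```r
set.seed(1)

n_sims <- 5000   # number of simulated three-arm studies
n      <- 20     # observations per arm (assumed)

sim <- replicate(n_sims, {
  ctrl <- mean(rnorm(n, mean = 10, sd = 2))   # shared control arm
  trt1 <- mean(rnorm(n, mean = 10, sd = 2))
  trt2 <- mean(rnorm(n, mean = 10, sd = 2))
  c(log(trt1 / ctrl), log(trt2 / ctrl))       # both use the same control mean
})

# Correlation between the two log ratios across simulated studies
r <- cor(sim[1, ], sim[2, ])
print(r)
```

With equal group sizes and variances, the control arm contributes about half of the sampling variance of each log ratio, so the correlation is clearly positive rather than zero, which is what a model treating the two outcomes as independent would ignore.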

As pointed out by Michael, you are essentially in a network meta-analysis type of situation. You may want to take a look at the following article for more details:

Salanti, G., Higgins, J. P. T., Ades, A. E., & Ioannidis, J. P. A. (2008). Evaluation of networks of randomized trials. Statistical Methods in Medical Research, 17(3), 279-301.

> Thanks!
> 
> -Jarrett