[R-meta] Question about meta-regression with multiple treatment studies

Thu May 16 11:04:24 CEST 2019

Dear André,

To start, the second argument of rma.uni() and rma.mv() is for the sampling *variances* (and for rma.mv(), it can even be an entire matrix that contains variances and covariances of the sampling errors), but you use 'V = pHmean_SE', so it seems that you are passing standard errors (i.e., the square root of the sampling variances) to the function. So, you really should use: 

Data$pHmean_VAR <- pHmean_SE^2

res <- rma.mv(yi = pHmean, 
              V = pHmean_VAR,
              ...

Then for lmer() and lme(), you should also use 'weights = 1/pHmean_VAR' and 'weights = varFixed(~pHmean_VAR)' (note that 'varFixed' is for fixing the *variances*).

In principle, lme() with 'control = lmeControl(sigma = 1)' should then give you the same results as rma.mv() (within numerical tolerances expected when using numerical methods), but it will not. As I note at the bottom of the page you linked to (http://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer), there is some issue with lme() when using 'control = lmeControl(sigma = 1)'. Results should be close, but not the same. I cannot tell you why this is, but I can guarantee you that the results from rma.mv() are correct (as they have been compared with several other R/software packages). lmer() of course doesn't allow you to fix that multiplicative factor to 1, so it is really fitting a different model.

Furthermore, the model needs an estimate level random effect. Please take a look at:

http://www.metafor-project.org/doku.php/analyses:konstantopoulos2011

and especially the section "A Common Mistake in the Three-Level Model". So, you should use:

Data$id <- 1:nrow(Data)

random = ~ 1 | study/id

By the way, the 'struct = "UN"' is irrelevant (the 'struct' argument only matters for '~ inner | outer' random effects terms).

But even that is not entirely sufficient in my opinion. If the same cows receive all treatments within a study, then the sampling errors are correlated, which one could capture by specifying appropriate sampling error covariances in the V matrix. Unfortunately, those covariances are unlike to be reported (you would need the entire var-cov matrix of those LS means, not just the standard errors). The approach (with 'random = ~ 1 | study/id') could be called the 'multilevel approach to multivariate estimates', which has been discussed quite a bit on this mailing list in the past. It usually should work ok (the 'study' level variance component will become inflated by assuming that V is diagonal, hopefully capturing the covariances among the sampling errors that are being ignored). See:

Moeyaert, M., Ugille, M., Beretvas, S. N., Ferron, J., Bunuan, R., & Van den Noortgate, W. (2017). Methods for dealing with multiple outcomes in meta-analysis: A comparison between averaging effect sizes, robust variance estimation and multilevel meta-analysis. International Journal of Social Research Methodology, 20(6), 559-572.

Van den Noortgate, W., Lopez-Lopez, J. A., Marin-Martinez, F., & Sanchez-Meca, J. (2013). Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45(2), 576-594.

Van den Noortgate, W., Lopez-Lopez, J. A., Marin-Martinez, F., & Sanchez-Meca, J. (2015). Meta-analysis of multiple outcomes: A multilevel approach. Behavior Research Methods, 47(4), 1274-1294.

Two additional approaches are:

1) Guestimate/assume a correlation among the sampling errors, compute the covariances (based on cor * SE1_1 * SE_2) (the impute_covariance_matrix() function from the 'clubSandwich' package is very useful for that), then use that as the V matrix, and repeat for various correlation values as a sensitivity analysis.

2) Either with or without step 1), use cluster-robust inferences methods. So fit the model assuming V is diagonal and then follow this up with:

robust(res, cluster=Data$study)

or use

coef_test(res, vcov = "CR2")

from clubSandwich, which does some fancy small-sample adjustments for computing the cluster-robust var-cov matrix of the fixed effects and then uses a Satterthwaite approximation for constructing approximate t-tests (James Pustejovsky, the author of clubSandwich, will chime in here if I said something stupid).

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces using r-project.org] On Behalf Of Mensching, Andre
Sent: Thursday, 16 May, 2019 9:54
To: r-sig-meta-analysis using r-project.org
Subject: [R-meta] Question about meta-regression with multiple treatment studies

Dear all experts in meta-analaysis,

I'm currently working on a meta-analysis in agricultural science. Unfortunately, I am very uncertain about the method to be used. After recently submitting a paper, I received major revisions and the advice to use the metafor package and in particular the rma.mv function. So far I only used linear mixed models and the lmer function from the lme4 package in R and the “inverse-variance” method.

I think it would be best to briefly outline the objective and describe the data structure:
I want to do a meta regression on the subject of metabolic diseases in dairy cows. The target variable is the daily mean pH value (= continues outcome) over a period of e.g. 48 h in the forehead system. The published studies are mostly based on replicated Latin Square experimental designs. For example 8 cows are split over 4 different treatments (= 2 x 2 factorial design with different diets, a "control group" is not necessarily present) within 1 of 4 periods. In the different periods the cows receive different treatments, thus each of the 8 cows is exposed to all treatments.  
The studies provide LSmean estimates per treatment for the mean pH as well as the standard error of these estimates (not the standard deviation!). In addition, information on treatments in form of the feed composition and feed analyses are provided. This information as well as milk constituents, for example, are to be examined as moderator variables.

Here is an example for the data structure in the long-format (simulated random numbers, DMI (=Dry Matter Intake) and NDF (Neutral Detergent Fiber) are moderators for the diet, additionally the milk fat content):

study pHmean pHmean_SE n_cows       DMI      NDF      fat
        1           6.09           0.05                     8    23.75    18.88     4.01
        1           6.08           0.05                     8    23.60    17.71     3.93
        1           5.84           0.05                     8    23.16    18.51     3.94
        1           5.96           0.05                     8    25.18    17.93     3.91
        2           5.76           0.09                   16    25.98    19.53     3.91
        2           6.00           0.09                   16    23.29    19.30     4.01
        2           6.13           0.09                   16    27.72    20.57     3.97
        2           5.92           0.09                   16    22.17    18.61     3.99
        3           5.98           0.09                     8    26.50    19.54     3.91
        3           6.00           0.09                     8    24.76    20.74     3.95
        3           6.12           0.09                     8    23.35    20.58     4.02
        3           5.86           0.09                     8    27.76    19.73     4.08 
       …               ...                …                    ...           …           ...        …  

I think that it is necessary to work with multiple mixed-effects models and to consider the raw means as effect size of each treatment. 

As already mentioned, so far I used lmer() [study coded as factor!) :

LMM <- lmer(pHmean ~ DMI + NDF + fat + (1|study), 
            weights = 1/pHmean_SE, 
            REML = TRUE, 
            data = Data)

I also tried the lme function from the package nlme:

LMM2 <- lme(pHmean ~ DMI + NDF + fat ,                               
            random = ~ 1 | study,
            weights = varFixed(~pHmean_SE),
            control = lmeControl(sigma = 1),
            method = "REML",
            data = Data)

My first approach with the rma.mv function is:

res <- rma.mv(yi = pHmean, 
              V = pHmean_SE,
              mods = ~ DMI + NDF + fat,
              random = ~ 1 | study,
              method = "REML",
              struct = "UN",  
              data = Data)

To the theme is also already a statement:
http://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer 

I can confirm that if nlme and lmer return the same result, if the control argument is not set to lmeControl(sigma = 1). If I set the argument control=lmeControl(sigma = 1), I get similar results to those of the rma.mv function (but not the same!).

During my literature research on studies with meta-analyses using a comparable data structure, I only found evaluations using either lmer (R) or proc mixed (SAS) with weights = 1/SEM  or  1/SEM^2. 

For example:
Roman-Garcia, Y., R.R. White, and J.L. Firkins. 2016. Meta-analysis of postruminal microbial nitrogen flows in dairy cattle. I. Derivation of equations. J. Dairy Sci. 99:7918–7931. http://dx.doi.org/10.3168/jds.2015-10661.
Santos, J.E.P., I.J. Lean, H. Golder, and E. Block. 2019. Meta-analysis of the effects of prepartum dietary cation-anion difference on performance and health of dairy cows. J. Dairy Sci. 102:2134–2154. http://dx.doi.org/10.3168/jds.2018-14628.
White, R.R., M.B. Hall, J.L. Firkins, and P.J. Kononoff. 2017. Physically adjusted neutral detergent fiber system for lactating dairy cow rations. I: Deriving equations that identify factors that influence effectiveness of fiber. J. Dairy Sci. 100:9551–9568. http://dx.doi.org/10.3168/jds.2017-12765.

To summarize the problems again: 
I actually used a method that others have used and even recently was published.
Nevertheless I was advised to use the metafor package with the rma.mv function, because it should be more appropriate. 
And yes, the results differ of course. So I do not know what is right and what is not. 

What do you think about that?

I would be very happy to hear from someone. Thank you very much in advance!

Best regards,
André