[R-meta] network meta-analysis - include block (within-study) level

Thu Aug 10 20:16:42 CEST 2017

Thanks Wolfgang for the effort to understand my question and sorry for my
confusing texts.

Just to try to clarify the concepts we've been talked for the list members:

My purpose was to estimate the soybean grain yield difference between
several fungicide treatments and the untreated check.

A 5 years (66 trials) data set was available for that where each trial was
a field soybean plots experiment with 4 or 5 blocks. Yield was assessed at
maturity crop stage and expressed in kg/ha.

This is the full dataset structure.

trt        trial   bk    yield
------------------------------------
Check   3       1     2493
Check   3       2     2173
Check   3       3     2628
Check   3       4     2168
Fox       3       1     3194
Fox       3       2     2363
Fox       3       3     2887
Fox       3       4     3278
NTX      3       1     2988
NTX      3       2     2361
NTX      3       3     2341
NTX      3       4     3218

Since the dataset contains different sets of treatments through the years I
considered using the network M-A approach.

Trials->    1  2  3  4 11 13 14 38 39 41 42 60 62 63 65 66
Check      1  1  1  1  1  1   1   1   1   1   1   1   1   1   1   1
Fox          0  0  0  0  0  0   0   1   1   1   1   1   1   1   1   1
NTX         1  1  1  1  1  1   1   1   1   1   1   0   0   0   0   0
EpoFlux   1  1  1  1  1  1   1   1   1   1   1   1   1   1   1   1

I followed some analysis methods published in papers from the
phytopathology-plant disease epidemiology area and I used the mean
treatment values within each trial and used the MSE from each trial anova
as variances. (A kind of two-stage analysis).
(http://apsjournals.apsnet.org/doi/abs/10.1094/PHYTO-97-2-0211)
<http://apsjournals.apsnet.org/doi/abs/10.1094/PHYTO-97-2-0211>

The data structure I meta-analyzed was:

trt       trial   n   yield_m     MSE           vi2
------------------------------------------------------------------
Check   3     4   2365.5      88931.9  22233
Fox       3     4   2930.5      88931.9  22233
NTX      3     4   2727.0      88931.9  22233

where vi2= MSE/n

However, since I have the raw full dataset, I wonder if it would be more
suitable to analyze the data in a one-stage analysis including blocks in
the model.

Wolfgang kindly pointed out all the thoughts included in the mail below.

Wolfgang: I really appreciate your answer and thanks for sharing your
knowledge. I will try your suggested approach.

Best wishes,

*Juan Edwards===========================PhD candidate - Plant Pathology
DepartmentUniversity of Sao Paulo / Brazil*

On Thu, Aug 10, 2017 at 5:45 PM, Viechtbauer Wolfgang (SP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:

>
> I think I now have a general understanding of your data structure and what
> you are doing. Some final thoughts:
>
> 1) If you include block as an effect in the ANOVAs, you must include block
> in whatever model you are going to fit afterwards. In particular, by
> including block as an effect in the ANOVAs, you are removing that source of
> variability from the residual variance. But the variance of a raw yield
> value is sigma^2_block + sigma^2_resid. By using the MSE for the variance,
> the sampling variance will only reflect sigma^2_resid, so you need to
> account for the block level variance in whatever model you fit afterwards.
>
> 2) Using
> 
> MSE/n (with n = 4/5) only makes sense if you are averaging within blocks
> (which you seem to be doing at the end). But even that is only approximate.
> Multiple yield values within the same block are correlated sigma^2_block /
> (sigma^2_block + sigma^2_resid) -- but MSE/n assumes independence.
>
> 3) Below, you end up constructing a dataset with yield values averaged
> within blocks. Now I don't understand the original question anymore,
> because in such a dataset, one cannot include block as another random
> effect. Also, see 1) -- in such a dataset, you cannot account for the block
> level variance.
>
> I still think you may want to analyze your data (not aggregated in any
> way) with a mixed-effects model directly, allowing for trial and block
> level variance (plus residual variance). In this case, you are making
> things more difficult by trying to do a two-stage analysis.
>
> Best,
> Wolfgang
>
> --
> Wolfgang Viechtbauer, Ph.D., Statistician | Department of Psychiatry and
> Neuropsychology | Maastricht University | P.O. Box 616 (VIJV1) | 6200 MD
> Maastricht, The Netherlands | +31 (43) 388-4170 | http://www.wvbauer.com
>
> -----Original Message-----
> From: Juan Pablo Edwards Molina [mailto:edwardsmolina at gmail.com]
> Sent: Wednesday, August 09, 2017 19:47
> To: Viechtbauer Wolfgang (SP)
> Cc: r-sig-meta-analysis at r-project.org
> Subject: Re: [R-meta] network meta-analysis - include block (within-study)
> level
>
> | Okay, so let me see if I understand. What you are showing below are
> actually the raw data from trial 3. And you have more trials of
> | that type (either with 4 or 5 blocks and treatments may differ slightly
> across trials, but all trials have 'Check'). So now you want to
> | meta-analyze those yield values, including 'treatment' as the predictor
> of interest (and accounting for the nested structure of the
> | data).
>
> Exactly Wolfgang, this is a sample of the treatments sets
>
> > table(dat$trt, dat$study)
>
> 
> Trials------->          1  2  3  4 11 13 14 38 39 41 42 60 62 63 65 66
>
> AA_CHECK           1  1  1  1   1   1   1   1   1   1   1   1   1   1   1
>   1
> AZ_BF                 0  0  0  0   0   0   0   1   1   1   1   1   1   1
> 1   1
> CZM                    1  1  1  1   1   1   1   1   1   1   1   0   0   0
>   0   0
> EPO_FLUX_PYRA   1  1  1  1   1   1   1   1   1   1   1   1   1   1   1   1
> FLUX_PYRA          0  0  0  0   1   1   1   1   1   1   1   1   1   1   1
>   1
> MZB                    0  0  0  0   0   0   0   1   1   1   1   1   1   1
>   1   1
> PROT_TRIF           1  1  1  1   1   1   1   1   1   1   1   1   1   1   1
>   1
>
> Te diferent sets of treatments was  the reason to led me to think about a
> network meta-analysis.
>
> vi2 = MSE/n # multivariate approach for Sampling variance of yield (n=4
> or 5)
>
> dat$fungic <- relevel(factor(yield_dat$fungic), ref="AA_CHECK")
>
> In order to do so, you need a variance of the yield values. Obviously,
> you do not have a variance per row, since each row is a single measurement.
> So, you fitted some ANOVA model to these data (per trial) in order to
> obtain the MSE and then want to use that as the variance for all yield
> values from that trial.
>
> yes, I used the treatment means within the trials and that MSE as trial
> variance.
>
> But what is 'n' in V_yield/n (i.e., MSE/n)? Seems to me that MSE *is* the
> variance you want. For example:
>
> 
> 
> dat <- read.table(header=TRUE, text = "
> trt     trial   bk    yield
> Check   3       1     2493
> Check   3       2     2173
> Check   3       3     2628
> Check   3       4     2168
> Fox     3       1     3194
> Fox     3       2     2363
> Fox     3       3     2887
> Fox     3       4     3278
> NTX     3       1     2988
> NTX     3       2     2361
> NTX     3       3     2341
> NTX     3       4     3218")
>
> res <- lm(yield ~ trt + factor(bk), data=dat)
> summary(res)
> 
> sigma(res)^2 ### error variance
>
> ### or treating 'bk' as random, but this yields the same results
>
> library(nlme)
> res <- lme(yield ~ trt, random = ~ 1 | bk, data=dat)
> summary(res)
> sigma(res)^2 ### error variance
>
> Using aov() would require restructuring the data, but will also yield the
> same results.
>
> But if you are doing all of that anyway, why not just analyze ALL of the
> data that way, adding trial as another random effect?
>
> Yes I did that for each treatment so final data set was:
>
> 
>     trial  year   bk           trt                yield
> 
>   MSE    vi2
> 1       1 2012   4    AA_CHECK          2640
> 
> 88931.9   22233
> 2       1 2012   4    CZM_CM_TEBU    2337   88931.9   22233
> 3       1 2012   4    CZM                   2733   88931.9   22233
> 4       1 2012   4    CZM+LS             2238   88931.9   22233
> 5       1 2012   4    EPO_FLUX_PYRA  2858   88931.9   22233
> 6       1 2012   4    EPO_FLUX_PYRA  3103   88931.9   22233
>
> Can I analyze it as mixed model even with the different sets of
> treatments?
>
> Many thanks!
> Juan
>
> -----Original Message-----
> From: Juan Pablo Edwards Molina [mailto:edwardsmolina at gmail.com]
> Sent: Wednesday, August 09, 2017 02:41
> To: Viechtbauer Wolfgang (SP)
> Cc: r-sig-meta-analysis at r-project.org
> Subject: Re: [R-meta] network meta-analysis - include block (within-study)
> level
>
> Please do not consider my last reply!!
> I mixed everything (because I am also estimating slopes and intercepts
> with the same data set...)
>
> I'm performing a network meta-analysis to estimate the treatments yield
> difference with the untreated check.
>
> this is my data structure,
>
> trt     trial   bk    yield
> Check   3       1     2493
> Check   3       2     2173
> Check   3       3     2628
> Check   3       4     2168
> Fox     3       1     3194
> Fox     3       2     2363
> Fox     3       3     2887
> Fox     3       4     3278
> NTX     3       1     2988
> NTX     3       2     2361
> NTX     3       3     2341
> NTX     3       4     3218
>
> yield = plot grain yield at crop maturity (single value). Actually, plots
> were ~ 15m², however the grain weight was expressed in kg/10000 m² (1ha). 
> bk = are the blcoks within each trial (4 or 5).
> trt = fungicide tratments to reduce a soybean disease.
>
> I estimated yield difference (with the check) by setting Check as
> reference level in the following model:
>
> net1 <- rma.mv(yield_mean, vi2, mods = ~ treatment, random = ~
> treatment| trial,
>                          method="ML", struct="UN", data=df)
>
> where yield_mean is the vector of treatments yield means and vi2 is the
> vector of sampling variances obtained by:
>
> vi2 <- V_yield/n  (for each trial)
>
> (V_yield = MSE from anova)
>
> Since I have the raw full dataset,  I wonder if the correct would be to
> include a block random effect.
>
> Sorry again...
>
> Juan
>  Edwards
>
> On Wed, Aug 9, 2017 at 6:34 AM, Viechtbauer Wolfgang (SP) <
> wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> So is 'y' is the mean treatment yield here? Also, is that really the
> average of multiple measurements (e.g., if there is subsampling)? Or is 'y'
> just the single measurement (yield) for that particular block and
> treatment? I still do not quite understand what kind of data you have.
> Also, what is 'x'?
>
> Best,
> Wolfgang
>
> -----Original Message-----
> From: Juan Pablo Edwards Molina [mailto:edwardsmolina at gmail.com]
> Sent: Tuesday, August 08, 2017 23:26
> To: Viechtbauer Wolfgang (SP)
> Cc: r-sig-meta-analysis at r-project.org
> Subject: Re: [R-meta] network meta-analysis - include block (within-study)
> level
>
> Pretty close to that structure you say:  I have several treatments at
> each block (balanced experiments), actually different set of treatments
> across the k-trials (all trials have the untreated Check)
>
> This are a few lines of trial 3:
> 
> trt    trial bk  x    y
> Check  3     1   40   2493
> Check  3     2   45   2173
> Check  3     3   40   2628
> Check  3     4   40   2168
> Fox    3     1   35   3194
> Fox    3     2   30   2363
> Fox    3     3   35   2887
> Fox    3     4   30   3278
> NTX    3     1   40   2988
> NTX    3     2   35   2361
> NTX    3     3   35   2341
> NTX    3     4   35   3218
> 
> | Also, do you have the raw mean and variance (or SD) and sample size for
> each row of the dataset? It seems like you are first fitting some kind of
> ANOVA within each study, but | that might actually complicate things.
>
> Yes, I have the raw full dataset so I have the observation level values
> to calculate SD, means..
>
> Several authors from the Phytopathology area use ANOVA MSE :
>
> 
> "...The within-study variance (V) for IND or DON for these fungicide
> trials is the residual variance (mean square error) from an analysis of
> variance (ANOVA) of the effects of treatment on disease or toxin. Where the
> original data were available, this variance was calculated directly from an
> ANOVA..."
>
> http://apsjournals.apsnet.org/doi/abs/10.1094/PHYTO-97-2-0211
>
> Juan
>
> On Tue, Aug 8, 2017 at 6:03 PM, Viechtbauer Wolfgang (SP) <
> wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> Dear Juan,
>
> Could you show a bit of the data (structure)? In particular, does each
> block contain two treatments, so that the structure looks something like
> this?
>
> trial block treatment mean
> --------------------------
> 1     1     1         ...
> 1     1     2         ...
> 1     2     1         ...
> 1     2     2         ...
> 2     1     1         ...
> 2     1     2         ...
> 2     2     1         ...
> 2     2     2         ...
> 2     3     1         ...
> 2     3     2         ...
> ...
> 
> Also, do you have the raw mean and variance (or SD) and sample size for
> each row of the dataset? It seems like you are first fitting some kind of
> ANOVA within each study, but that might actually complicate things.
>
> Best,
> Wolfgang
>
> -----Original Message-----
> From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-
> bounces at r-project.org] On Behalf Of Juan Pablo Edwards Molina
> Sent: Tuesday, August 08, 2017 22:09
> To: r-sig-meta-analysis at r-project.org
> Subject: [R-meta] network meta-analysis - include block (within-study)
> level
>
> Dear list,
>
> I have a dataset containing crop field randomized block design experiments
> with observations at plot level (experimental unit), and I want to estimate
> the treatments grain yield difference relative to a untreated check.
> 
> net1 <- rma.mv(yield, vi2, mods = ~ treatment, random = ~ treatment|
> trial,
>                          method="ML", struct="UN", data=df)
> 
> where yield is the vector of mean treatments yield for vi2 is the vector of
> sampling variances obtained by:
>
> vi2 <- V_yield/n  (for each trial)
>
> (V_yield = MSE from anova)
>
> Do I need to include the block in the model? or using the experiment
> treatments means will obtain the same results? I suppose something like:
>
> net2 <- rma.mv(yield, vi2, mods = ~ treatment, random = ~ treatment|
> block|
> trial,
>                          method="ML", struct="UN", data=df)
>
> If the latter would be a better approach, how do I include the sampling
> variance?
>
> Thanks in advance,
>
> Juan Edwards
>

	[[alternative HTML version deleted]]