[R-sig-ME] GLMM for Combined experiments and overdispersed data

Tue Apr 25 15:40:03 CEST 2017

Thierry, sorry to bother you again...
I tried to follow your suggestion and I did the herlmert contrasts with
lsmeans package.

dinc <- within(dinc, { tree_id <- as.factor(interaction(farm, trt, bk,
tree)) })

resp1 <- with(dinc, cbind(dis, tot-dis))

m0 = glmer(resp1 ~ trt + farm + (1|tree_id), family = binomial, data=dinc)

> summary(m0)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [
glmerMod]
 Family: binomial  ( logit )
Formula: resp1 ~ trt + farm + (1 | tree_id)
   Data: dinc

     AIC      BIC   logLik deviance df.resid
   521.5    543.8   -252.7    505.5      112

Scaled residuals:
     Min       1Q   Median       3Q      Max
-0.90445 -0.51114 -0.00572  0.31456  1.04667

Random effects:
 Groups  Name        Variance Std.Dev.
 tree_id (Intercept) 1.028    1.014
Number of obs: 120, groups:  tree_id, 120

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.87786    0.37604 -12.972  < 2e-16 ***
trtG10      -0.06738    0.49125  -0.137  0.89090
trtG15       0.90620    0.44435   2.039  0.04141 *
trtG20       1.13733    0.43920   2.590  0.00961 **
trtControl   5.10202    0.41215  12.379  < 2e-16 ***
farmstacruz -0.80155    0.30294  -2.646  0.00815 **
farmtaqua   -0.84738    0.30659  -2.764  0.00571 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) trtG10 trtG15 trtG20 trtCnt frmstc
trtG10      -0.635
trtG15      -0.710  0.538
trtG20      -0.720  0.546  0.604
trtControl  -0.763  0.571  0.650  0.651
farmstacruz -0.300  0.022 -0.015  0.010 -0.081
farmtaqua   -0.301  0.012  0.004  0.017 -0.083  0.441

### Setting up a custom contrast function

helmert.lsmc <- function(levs, ...) {
  M <- as.data.frame(contr.helmert(levs))
  names(M) <- paste(levs[-1],"vs calendar")
  attr(M, "desc") <- "Helmert contrasts"
  M
}

> lsmeans(m0, helmert ~ trt, type = "response")

$lsmeans
 trt             prob          SE df   asymp.LCL   asymp.UCL
 Calendar 0.004374833 0.001541432 NA 0.002191201 0.008715549
 G10      0.004090935 0.001498922 NA 0.001993307 0.008377443
 G15      0.010757825 0.003091538 NA 0.006116214 0.018855141
 G20      0.013517346 0.003832360 NA 0.007740919 0.023502164
 Control  0.419339074 0.051564118 NA 0.322885163 0.522377507

Results are averaged over the levels of: farm
Confidence level used: 0.95
Intervals are back-transformed from the logit scale

$contrasts
 contrast              odds.ratio           SE df z.ratio p.value
 G10 vs calendar     9.348400e-01 4.592373e-01 NA  -0.137  0.8909
 G15 vs calendar     6.552019e+00 4.909975e+00 NA   2.508  0.0121
 G20 vs calendar     1.310737e+01 1.307704e+01 NA   2.579  0.0099
 Control vs calendar 1.011296e+08 1.124089e+08 NA  16.582  <.0001

Results are averaged over the levels of: farm
Tests are performed on the log odds ratio scale

## Do you think it's correct, if I consider trt calendar as the reference
to test my other treatments?

Thanks!

Juan

*Juan*

On Mon, Apr 24, 2017 at 11:46 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be
> wrote:

> Dear Peter,
>
> Both models will yield identical results in case tree_id uses unique codes
> over the blocks.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> 2017-04-24 15:55 GMT+02:00 Peter Claussen <dakotajudo at mac.com>:
>
>> Juan,
>>
>> I would model this as
>>
>> m3 = glmer(resp ~ trt * farm +  (1| bk/tree), family = binomial, data=df)
>> or
>> m3 = glmer(resp ~ trt * farm +  (1| bk) +  (1| tree_id), family =
>> binomial, data=df)
>> (I can’t say off the top of my head if what the difference would be if
>> you’re dealing with over-dispersion).
>>
>> 1. I’m assuming that block is a somewhat uniform grouping of trees, so
>> that including block in the model gives you an estimate of spatial
>> variability in the response, and if that is important relative to
>> tree-to-tree variation.
>>
>> 2. You will most certainly want to include trt*farm to test for
>> treatment-by-environment interaction. If interaction is not significant,
>> you may choose to exclude interaction from the model. If there is
>> interaction, then you will want to examine each farm to determine if
>> cross-over interaction present.
>>
>> If your experiment is to determine the “best” fungicide spraying system,
>> and cross-over interaction is present, then you have no “best” system. You
>> might have cross-over arising because, say, system 1 ranks “best” on farm
>> 1, but system 2 ranks “best” on farm 2.
>>
>> There is extensive literature on the topic, mostly from the plant
>> breeding genotype-by-environment interaction side. Some of the associated
>> statistics implemented in the agricolae package, i.e. AMMI.
>>
>> Peter
>>
>> > On Apr 24, 2017, at 6:56 AM, Juan Pablo Edwards Molina <
>> edwardsmolina at gmail.com> wrote:
>> >
>> > I´m sorry... I´m new in the list, and when I figured out that the
>> question
>> > would suit best in the mixed model list I had already post it in general
>> > R-help. I don´t know if there´s a way to "cancel a question"... I will
>> take
>> > care of it from now on.
>> >
>> > Dear Thierry, thanks for your answer.
>> > Yes, I am not interested in the effect of a specific farm, they simply
>> > represent the total of farms from the region where I want to suggest the
>> > best treatments.
>> >
>> > I Followed your suggestions, but still have a couple of doubts,
>> >
>> > 1- May "farm" be include as a simple fixed effect or interacting with
>> the
>> > treatment?
>> >
>> > m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df)
>> > m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df)
>> >
>> > 2 - 
>> > In case of significant
>> > [ trt * farm ], should I report the results for each farm?
>> >
>> > Thanks again Thierry,
>> >
>> > Juan Edwards
>> >
>> >
>> > *Juan*
>> >
>> > On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <
>> thierry.onkelinx at inbo.be>
>> > wrote:
>> >
>> >> Dear Juan,
>> >>
>> >> Use unique id's for random effects variables. So each bk should only be
>> >> present in one farm. And each tree_id should be present in only one
>> bk. In
>> >> case each block has different treatments then each tree_id should be
>> unique
>> >> to one combination of bk and trt.
>> >>
>> >> Farm has too few levels to be a random effects. So either model is as a
>> >> fixed effect or drop it. In case you drop it, the information will be
>> >> picked up by bk. Note that trt + (1|farm) is less complex than trt *
>> farm.
>> >>
>> >> Assuming that you are not interested in the effect of a specific farm,
>> you
>> >> could use sum, polynomial or helmert contrasts for the farms. Unlike
>> the
>> >> default treatment contrast, these type of contrasts sum to zero. Thus
>> the
>> >> effect of trt will be that for the average farm instead of the
>> reference
>> >> farm.
>> >>
>> >> Best regards,
>> >>
>> >> ir. Thierry Onkelinx
>> >> Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>> and
>> >> Forest
>> >> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>> >> Kliniekstraat 25
>> >> 1070 Anderlecht
>> >> Belgium
>> >>
>> >> To call in the statistician after the experiment is done may be no more
>> >> than asking him to perform a post-mortem examination: he may be able
>> to say
>> >> what the experiment died of. ~ Sir Ronald Aylmer Fisher
>> >> The plural of anecdote is not data. ~ Roger Brinner
>> >> The combination of some data and an aching desire for an answer does
>> not
>> >> ensure that a reasonable answer can be extracted from a given body of
>> data.
>> >> ~ John Tukey
>> >>
>> >> 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina <
>> >> edwardsmolina at gmail.com>:
>> >>
>> >>> I am analyzing data from 3 field experiments (farms=3) for a citrus
>> flower
>> >>> disease: response variable is binomial because the flower can only be
>> >>> diseased or healthy.
>> >>>
>> >>> I have particular interest in comparing 5 fungicide spraying systems
>> >>> (trt=5).
>> >>>
>> >>> Each farm had 4 blocks (bk=4) including 2 trees as subsamples
>> (tree=2) in
>> >>> which I assessed 100 flowers each one. This is a quick look of the
>> data:
>> >>>
>> >>> farm      trt      bk    tree   dis   tot     <fctr>   <fctr>  <fctr>
>> >>> <fctr> <int> <int>
>> >>> iaras      cal      1      1     0    100
>> >>> iaras      cal      1      2     1    100
>> >>> iaras      cal      2      1     1    100
>> >>> iaras      cal      2      2     3    100
>> >>> iaras      cal      3      1     0    100
>> >>> iaras      cal      3      2     5    100...
>> >>>
>> >>> The model I considered was:
>> >>>
>> >>> resp <- with(df, cbind(dis, tot-dis))
>> >>>
>> >>> m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df)
>> >>>
>> >>> I tested the overdispersion with the overdisp_fun() from GLMM page
>> >>> <http://glmm.wikidot.com/faq>
>> >>>
>> >>>        chisq         ratio             p          logp
>> >>> 4.191645e+02  3.742540e+00  4.804126e-37 -8.362617e+01
>> >>>
>> >>> As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I
>> >>> considered to add the observation level random effect (link
>> >>> <http://r.789695.n4.nabble.com/Question-on-overdispersion-
>> td3049898.html
>> >>>> )
>> >>> to deal with the overdispersion.
>> >>>
>> >>> farm      trt      bk    tree   dis   tot tree_id    <fctr>   <fctr>
>> >>> <fctr> <fctr> <int> <int> <fctr>
>> >>> iaras      cal      1      1     0    100    1
>> >>> iaras      cal      1      2     1    100    2
>> >>> iaras      cal      2      1     1    100    3...
>> >>>
>> >>> so now was added a random effect for each row (tree_id) to the model,
>> but
>> >>> I
>> >>> am not sure of how to include it. This is my approach:
>> >>>
>> >>> m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial,
>> >>> data=df)
>> >>>
>> >>> I also wonder if farm should be a fixed effect, since it has only 3
>> >>> levels...
>> >>>
>> >>> m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family =
>> >>> binomial, data=df)
>> >>>
>> >>> I really appreciate your suggestions about my model specifications...
>> >>>
>> >>>
>> >>>
>> >>> *Juan Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD
>> student
>> >>> - ESALQ-USP/Brazil*
>> >>>
>> >>>        [[alternative HTML version deleted]]
>> >>>
>> >>> _______________________________________________
>> >>> R-sig-mixed-models at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >>
>> >>
>> >>
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> > _______________________________________________
>> > R-sig-mixed-models at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>

	[[alternative HTML version deleted]]