[R-sig-ME] GLMM for Combined experiments and overdispersed data

Tue Apr 25 16:29:12 CEST 2017

Perhaps I’m missing something.

Are the individual trees the experimental units, and are you only taking one count per tree, or are there multiple counts? That makes me think that that tree_id variance is effectively residual variance and that there is no random effect other than residual.

You have 112 df.resid, and 120 groups : tree_id.

So what do you gain by using glmer as opposed to gem with a quasibinomial family?

Peter

> On Apr 25, 2017, at 8:40 AM, Juan Pablo Edwards Molina <edwardsmolina at gmail.com> wrote:
> 
> Thierry, sorry to bother you again...
> I tried to follow your suggestion and I did the herlmert contrasts with lsmeans package.
> 
> dinc <- within(dinc, { tree_id <- as.factor(interaction(farm, trt, bk, tree)) })
> 
> resp1 <- with(dinc, cbind(dis, tot-dis)) 
> 
> m0 = glmer(resp1 ~ trt + farm + (1|tree_id), family = binomial, data=dinc) 
> 
> > summary(m0)
> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [
> glmerMod]
>  Family: binomial  ( logit )
> Formula: resp1 ~ trt + farm + (1 | tree_id)
>    Data: dinc
> 
>      AIC      BIC   logLik deviance df.resid 
>    521.5    543.8   -252.7    505.5      112 
> 
> Scaled residuals: 
>      Min       1Q   Median       3Q      Max 
> -0.90445 -0.51114 -0.00572  0.31456  1.04667 
> 
> Random effects:
>  Groups  Name        Variance Std.Dev.
>  tree_id (Intercept) 1.028    1.014   
> Number of obs: 120, groups:  tree_id, 120
> 
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)    
> (Intercept) -4.87786    0.37604 -12.972  < 2e-16 ***
> trtG10      -0.06738    0.49125  -0.137  0.89090    
> trtG15       0.90620    0.44435   2.039  0.04141 *  
> trtG20       1.13733    0.43920   2.590  0.00961 ** 
> trtControl   5.10202    0.41215  12.379  < 2e-16 ***
> farmstacruz -0.80155    0.30294  -2.646  0.00815 ** 
> farmtaqua   -0.84738    0.30659  -2.764  0.00571 ** 
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> Correlation of Fixed Effects:
>             (Intr) trtG10 trtG15 trtG20 trtCnt frmstc
> trtG10      -0.635                                   
> trtG15      -0.710  0.538                            
> trtG20      -0.720  0.546  0.604                     
> trtControl  -0.763  0.571  0.650  0.651              
> farmstacruz -0.300  0.022 -0.015  0.010 -0.081       
> farmtaqua   -0.301  0.012  0.004  0.017 -0.083  0.441
> 
> ### Setting up a custom contrast function
> 
> helmert.lsmc <- function(levs, ...) {
>   M <- as.data.frame(contr.helmert(levs))
>   names(M) <- paste(levs[-1],"vs calendar")
>   attr(M, "desc") <- "Helmert contrasts"
>   M
> }
> 
> > lsmeans(m0, helmert ~ trt, type = "response")
> 
> $lsmeans
>  trt             prob          SE df   asymp.LCL   asymp.UCL
>  Calendar 0.004374833 0.001541432 NA 0.002191201 0.008715549
>  G10      0.004090935 0.001498922 NA 0.001993307 0.008377443
>  G15      0.010757825 0.003091538 NA 0.006116214 0.018855141
>  G20      0.013517346 0.003832360 NA 0.007740919 0.023502164
>  Control  0.419339074 0.051564118 NA 0.322885163 0.522377507
> 
> Results are averaged over the levels of: farm 
> Confidence level used: 0.95 
> Intervals are back-transformed from the logit scale 
> 
> $contrasts
>  contrast              odds.ratio           SE df z.ratio p.value
>  G10 vs calendar     9.348400e-01 4.592373e-01 NA  -0.137  0.8909
>  G15 vs calendar     6.552019e+00 4.909975e+00 NA   2.508  0.0121
>  G20 vs calendar     1.310737e+01 1.307704e+01 NA   2.579  0.0099
>  Control vs calendar 1.011296e+08 1.124089e+08 NA  16.582  <.0001
> 
> Results are averaged over the levels of: farm 
> Tests are performed on the log odds ratio scale 
> 
> ## Do you think it's correct, if I consider trt calendar as the reference to test my other treatments?
> 
> Thanks!
> 
> Juan
> 
> Juan
> 
> On Mon, Apr 24, 2017 at 11:46 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be <mailto:thierry.onkelinx at inbo.be>> wrote:
> Dear Peter,
> 
> Both models will yield identical results in case tree_id uses unique codes over the blocks.
> 
> Best regards,
> 
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest 
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance 
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
> 
> To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner 
> The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
> 
> 2017-04-24 15:55 GMT+02:00 Peter Claussen <dakotajudo at mac.com <mailto:dakotajudo at mac.com>>:
> Juan,
> 
> I would model this as
> 
> m3 = glmer(resp ~ trt * farm +  (1| bk/tree), family = binomial, data=df)
> or
> m3 = glmer(resp ~ trt * farm +  (1| bk) +  (1| tree_id), family = binomial, data=df)
> (I can’t say off the top of my head if what the difference would be if you’re dealing with over-dispersion).
> 
> 1. I’m assuming that block is a somewhat uniform grouping of trees, so that including block in the model gives you an estimate of spatial variability in the response, and if that is important relative to tree-to-tree variation.
> 
> 2. You will most certainly want to include trt*farm to test for treatment-by-environment interaction. If interaction is not significant, you may choose to exclude interaction from the model. If there is interaction, then you will want to examine each farm to determine if cross-over interaction present.
> 
> If your experiment is to determine the “best” fungicide spraying system, and cross-over interaction is present, then you have no “best” system. You might have cross-over arising because, say, system 1 ranks “best” on farm 1, but system 2 ranks “best” on farm 2.
> 
> There is extensive literature on the topic, mostly from the plant breeding genotype-by-environment interaction side. Some of the associated statistics implemented in the agricolae package, i.e. AMMI.
> 
> Peter
> 
> > On Apr 24, 2017, at 6:56 AM, Juan Pablo Edwards Molina <edwardsmolina at gmail.com <mailto:edwardsmolina at gmail.com>> wrote:
> >
> > I´m sorry... I´m new in the list, and when I figured out that the question
> > would suit best in the mixed model list I had already post it in general
> > R-help. I don´t know if there´s a way to "cancel a question"... I will take
> > care of it from now on.
> >
> > Dear Thierry, thanks for your answer.
> > Yes, I am not interested in the effect of a specific farm, they simply
> > represent the total of farms from the region where I want to suggest the
> > best treatments.
> >
> > I Followed your suggestions, but still have a couple of doubts,
> >
> > 1- May "farm" be include as a simple fixed effect or interacting with the
> > treatment?
> >
> > m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df)
> > m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df)
> >
> > 2 - 
> > In case of significant
> > [ trt * farm ], should I report the results for each farm?
> >
> > Thanks again Thierry,
> >
> > Juan Edwards
> >
> >
> > *Juan*
> >
> > On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be <mailto:thierry.onkelinx at inbo.be>>
> > wrote:
> >
> >> Dear Juan,
> >>
> >> Use unique id's for random effects variables. So each bk should only be
> >> present in one farm. And each tree_id should be present in only one bk. In
> >> case each block has different treatments then each tree_id should be unique
> >> to one combination of bk and trt.
> >>
> >> Farm has too few levels to be a random effects. So either model is as a
> >> fixed effect or drop it. In case you drop it, the information will be
> >> picked up by bk. Note that trt + (1|farm) is less complex than trt * farm.
> >>
> >> Assuming that you are not interested in the effect of a specific farm, you
> >> could use sum, polynomial or helmert contrasts for the farms. Unlike the
> >> default treatment contrast, these type of contrasts sum to zero. Thus the
> >> effect of trt will be that for the average farm instead of the reference
> >> farm.
> >>
> >> Best regards,
> >>
> >> ir. Thierry Onkelinx
> >> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> >> Forest
> >> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> >> Kliniekstraat 25
> >> 1070 Anderlecht
> >> Belgium
> >>
> >> To call in the statistician after the experiment is done may be no more
> >> than asking him to perform a post-mortem examination: he may be able to say
> >> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> >> The plural of anecdote is not data. ~ Roger Brinner
> >> The combination of some data and an aching desire for an answer does not
> >> ensure that a reasonable answer can be extracted from a given body of data.
> >> ~ John Tukey
> >>
> >> 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina <
> >> edwardsmolina at gmail.com <mailto:edwardsmolina at gmail.com>>:
> >>
> >>> I am analyzing data from 3 field experiments (farms=3) for a citrus flower
> >>> disease: response variable is binomial because the flower can only be
> >>> diseased or healthy.
> >>>
> >>> I have particular interest in comparing 5 fungicide spraying systems
> >>> (trt=5).
> >>>
> >>> Each farm had 4 blocks (bk=4) including 2 trees as subsamples (tree=2) in
> >>> which I assessed 100 flowers each one. This is a quick look of the data:
> >>>
> >>> farm      trt      bk    tree   dis   tot     <fctr>   <fctr>  <fctr>
> >>> <fctr> <int> <int>
> >>> iaras      cal      1      1     0    100
> >>> iaras      cal      1      2     1    100
> >>> iaras      cal      2      1     1    100
> >>> iaras      cal      2      2     3    100
> >>> iaras      cal      3      1     0    100
> >>> iaras      cal      3      2     5    100...
> >>>
> >>> The model I considered was:
> >>>
> >>> resp <- with(df, cbind(dis, tot-dis))
> >>>
> >>> m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df)
> >>>
> >>> I tested the overdispersion with the overdisp_fun() from GLMM page
> >>> <http://glmm.wikidot.com/faq <http://glmm.wikidot.com/faq>>
> >>>
> >>>        chisq         ratio             p          logp
> >>> 4.191645e+02  3.742540e+00  4.804126e-37 -8.362617e+01
> >>>
> >>> As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I
> >>> considered to add the observation level random effect (link
> >>> <http://r.789695.n4.nabble.com/Question-on-overdispersion-td3049898.html <http://r.789695.n4.nabble.com/Question-on-overdispersion-td3049898.html>
> >>>> )
> >>> to deal with the overdispersion.
> >>>
> >>> farm      trt      bk    tree   dis   tot tree_id    <fctr>   <fctr>
> >>> <fctr> <fctr> <int> <int> <fctr>
> >>> iaras      cal      1      1     0    100    1
> >>> iaras      cal      1      2     1    100    2
> >>> iaras      cal      2      1     1    100    3...
> >>>
> >>> so now was added a random effect for each row (tree_id) to the model, but
> >>> I
> >>> am not sure of how to include it. This is my approach:
> >>>
> >>> m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial,
> >>> data=df)
> >>>
> >>> I also wonder if farm should be a fixed effect, since it has only 3
> >>> levels...
> >>>
> >>> m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family =
> >>> binomial, data=df)
> >>>
> >>> I really appreciate your suggestions about my model specifications...
> >>>
> >>>
> >>>
> >>> *Juan Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD student
> >>> - ESALQ-USP/Brazil*
> >>>
> >>>        [[alternative HTML version deleted]]
> >>>
> >>> _______________________________________________
> >>> R-sig-mixed-models at r-project.org <mailto:R-sig-mixed-models at r-project.org> mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> >>
> >>
> >>
> >
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org <mailto:R-sig-mixed-models at r-project.org> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org <mailto:R-sig-mixed-models at r-project.org> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
> 

	[[alternative HTML version deleted]]