[R-sig-ME] glmms:Checking out over_under_dispersion

C. AMAL D. GLELE altessedac2 using gmail.com
Fri Aug 10 15:12:12 CEST 2018

Dear all,
Many thanks for your very useful advice.
Best wishes,

2018-08-10 5:59 GMT+02:00 John Maindonald <john.maindonald using anu.edu.au>:

> Note also the comments of McCullagh and Nelder (2nd edn, 1989, p. 126),
> speaking somewhat disparagingly about the use of beta-binomial
> models as a way to model dispersion, as defined for glm():
> “Though this is an attractive option from a theoretical standpoint, in
> practice it seems unwise to rely on a specific form of over-dispersion,
> particularly where the assumed form has been chosen for mathematical
> convenience rather than scientific plausibility.”
> At least for data with which I have been working, I beg to disagree!
> The great virtue of glmmTMB::glmmTMB() is that it allows modeling of
> its version of the dispersion parameter (not dispersion as for glm()) as a
> function of explanatory variables.  My experience of using glmmTMB()
> with several insect mortality datasets from a much larger collection
> was that, consistently, the over-dispersion factor was large at midrange
> mortalities, reducing to close to 1 (i.e., binomial-like) at high
> mortalities.
> There are only 2 datasets that I have access to that I am currently free
> to make public, unfortunately.
> [I suspect that one should somehow be modeling the relevant parameter
> as a function of estimated mortality rather than indirectly as a function
> of explanatory variables.  I’ve wondered whether there is some different
> way to handle the parameterization that would build this in.]
> I’ve not tried modeling the GLM-style over-dispersion as a function of
> explanatory variables; some of the software that is available may
> allow this.  My guess is that, as reported in Morgan and
> Ridout (2008) for very different data, the beta-binomial would be
> favoured over a quasi-binomial, with a mixture of the two doing better
> still.
> [A new mixture model for capture heterogeneity. Journal of the Royal
> Statistical Society: Series C (Applied Statistics).
> https://doi.org/10.1111/j.1467-9876.2008.00620.x]
> See https://maths-people.anu.edu.au/~johnm/r-book/4edn/ch7-BetaBinomial.pdf
> for details of what I have done with a dataset that I have permission to
> expose to public view.
> The beta-binomial implies that the variance can never be reduced below
> a lower bound that depends on the dispersion parameter, which I find
> convenient to take for this purpose as the intra-class correlation.
> That is a big difference, if one wants to use results for designing
> further trials,
> from the story that comes from a quasi-binomial model. I think it more
> likely that the benefits of increasing sample size attenuate as the sample
> size increases, with no variance lower bound.  For the recent data on
> which I had been working, the relevant glmmTMB abilities became
> available too recently (~Jan, 2018) to be applied across all the datasets
> to which I had access.  With what I believe I now know, I’d have had
> the confidence to pursue the use of other packages that can be used,
> with a bit more effort, to achieve a similar result.  Hindsight is a great
> thing.
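
As a rough illustration of the variance lower bound described above, the sketch below compares the variance of an observed proportion y/n under the two models. It assumes the usual parameterizations: the beta-binomial with intra-class correlation rho gives p(1-p)/n * [1 + (n-1)*rho], and the quasi-binomial with constant dispersion factor phi gives phi*p(1-p)/n. The function names and the numeric values of p, rho and phi are made up for illustration and do not come from any dataset in this thread.

```python
def betabinomial_prop_variance(p, n, rho):
    """Variance of y/n under a beta-binomial model with mean p,
    group size n and intra-class correlation rho (assumed parameterization)."""
    return p * (1 - p) / n * (1 + (n - 1) * rho)


def quasibinomial_prop_variance(p, n, phi):
    """Variance of y/n under a quasi-binomial model with a constant
    dispersion factor phi (assumed parameterization)."""
    return phi * p * (1 - p) / n


# Illustrative values only.
p, rho, phi = 0.5, 0.05, 3.0

# The beta-binomial variance cannot fall below p * (1 - p) * rho,
# however large n becomes; the quasi-binomial variance keeps shrinking.
lower_bound = p * (1 - p) * rho

for n in (10, 100, 1000, 10000):
    bb = betabinomial_prop_variance(p, n, rho)
    qb = quasibinomial_prop_variance(p, n, phi)
    print(f"n={n:6d}  beta-binomial={bb:.6f}  quasi-binomial={qb:.6f}")
```

Under these assumptions the beta-binomial variance levels off near lower_bound as n grows, while the quasi-binomial variance tends to zero, which is why the two models tell different stories when planning the size of further trials.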
> Were I in mid-career, I’d likely be pursuing these ideas with some vigour.
> I’d be happy to co-operate with anyone who wants to take them further,
> and might be able to negotiate access to a wider range of datasets than
> I can currently expose to public view.  It surprises me that this seems an
> area that has been very little explored, certainly as it relates to plant
> quarantine research — what has been done to date, including work that
> I did in the 1980s and 1990s, now strikes me as naive.
> John Maindonald             email: john.maindonald using anu.edu.au
> On 10/08/2018, at 14:10, Ben Bolker <bbolker using gmail.com> wrote:
> The standard advice is to compare either the residual deviance or the
> sum of squares of the Pearson residuals to the residual degrees of
> freedom (i.e. (number of observations) - (number of parameters)). This
> is essentially taking the advice for GLMs (see e.g. McCullagh and
> Nelder, or probably any textbook on GLMs) and applying it to GLMMs.
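
The check described above can be sketched by hand for binomial data as follows. This is only a minimal illustration, assuming that fitted probabilities have already been extracted from the model; the function name, its arguments and the toy numbers are hypothetical, and what to count as a "parameter" in a GLMM (fixed effects plus variance parameters is one common choice) is left to the user.

```python
def pearson_dispersion_ratio(successes, trials, fitted_probs, n_params):
    """Sum of squared Pearson residuals divided by the residual degrees
    of freedom, for binomial responses."""
    if not (len(successes) == len(trials) == len(fitted_probs)):
        raise ValueError("inputs must have the same length")

    chi_sq = 0.0
    for y, n, p in zip(successes, trials, fitted_probs):
        expected = n * p                 # binomial mean
        variance = n * p * (1 - p)       # binomial variance
        chi_sq += (y - expected) ** 2 / variance

    # (number of observations) - (number of parameters)
    resid_df = len(successes) - n_params
    if resid_df <= 0:
        raise ValueError("residual degrees of freedom must be positive")
    return chi_sq / resid_df


# Toy data, made up for illustration only.
successes = [2, 8, 5, 9]
trials = [10, 10, 10, 10]
fitted_probs = [0.5, 0.5, 0.5, 0.5]

ratio = pearson_dispersion_ratio(successes, trials, fitted_probs, n_params=1)
print(ratio)
```

A ratio close to 1 is what the binomial assumption predicts; values well above 1 point to over-dispersion and values well below 1 to under-dispersion.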
> On Thu, Aug 9, 2018 at 9:24 PM C. AMAL D. GLELE <altessedac2 using gmail.com>
> wrote:
> If, for a given fitted GLMM "mod", I don't want to use an available tool to
> check for over- or under-dispersion, with which variance should I compare
> the total variance explained by mod?
> In advance, thanks for your replies.
> Kind regards,
>        [[alternative HTML version deleted]]
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
