[R-sig-ME] overdispersion and the one random effect per observation approach

Sat Jun 26 18:50:08 CEST 2010

  @article{elston_analysis_2001,
        title = {Analysis of aggregation, a worked example: numbers of ticks on red grouse chicks},
        volume = {122},
        number = {5},
        journal = {Parasitology},
        author = {D. A. Elston and R. Moss and T. Boulinier and C. Arrowsmith and X. Lambin},
        year = {2001},
        pages = {563--569}
}

 ... although, having looked at this particular example more carefully, I
think I might *not* recommend the particular approach they took (i.e.,
using PQL with per-individual random effects, which the Genstat/ASReml
documentation explicitly advises against ...)

  Try searching for "Poisson-lognormal distribution" too ...
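
  For anyone who wants to see why the observation-level random effect
gives a Poisson-lognormal distribution, here is a minimal simulation
sketch. It is in Python with only the standard library, purely for
illustration; it is not taken from Elston et al. or from lme4, and the
function names (rpois, simulate, mean_var) and parameter values are
my own assumptions. Each observation gets its own normal deviate on the
log scale, so the counts are Poisson conditional on that deviate and
overdispersed marginally.

```python
import math
import random


def rpois(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for the modest rates used here; not meant for very large lam.
    """
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate(n, eta, sigma, seed=1):
    """Simulate n counts with log-rate eta + eps_i, eps_i ~ Normal(0, sigma^2).

    sigma = 0 gives plain Poisson counts; sigma > 0 gives Poisson-lognormal
    counts, i.e. one random effect per observation.
    """
    rng = random.Random(seed)
    return [rpois(rng, math.exp(eta + rng.gauss(0.0, sigma))) for _ in range(n)]


def mean_var(ys):
    """Return the sample mean and the (n - 1)-denominator sample variance."""
    m = sum(ys) / len(ys)
    v = sum((y - m) ** 2 for y in ys) / (len(ys) - 1)
    return m, v


# Hypothetical settings: baseline rate 5, random-effect SD 0.7 on the log scale.
plain = simulate(20000, math.log(5.0), 0.0)
olre = simulate(20000, math.log(5.0), 0.7)

m0, v0 = mean_var(plain)
m1, v1 = mean_var(olre)

# For the Poisson-lognormal, Var(y) = mu + mu^2 * (exp(sigma^2) - 1), where
# mu = exp(eta + sigma^2 / 2), so the variance/mean ratio exceeds 1.
print("plain Poisson     variance/mean:", v0 / m0)
print("Poisson-lognormal variance/mean:", v1 / m1)
```

  The first ratio should come out close to 1 and the second well above
1. Unlike the quasi-Poisson case, there is a genuine probability
distribution here, so the likelihood is well defined.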

Luca Borger wrote:
> Dear All,
> 
> it has been recently discussed on this list (e.g. see below, as well as 
> http://glmm.wikidot.com/faq) that overdispersed distributions can be 
> modelled by using an observation-level random effect (i.e. one random effect 
> per observation). I am wondering if anyone knows a good reference for this 
> approach. John Maindonald kindly pointed me to an example in the new edition 
> of his book:
> 
>> There is an example in Section 10.5 of the 3rd edition of Data Analysis & 
>> Graphics Using R, which is just now out.
> 
> Does anyone know other refs? Thanks in advance for your help!
> 
> 
> Cheers,
> 
> Luca
> 
> 
> -------------------
> Luca Börger, PhD
> Postdoctoral Research Fellow
> Department of Integrative Biology
> University of Guelph
> Guelph, Ontario, Canada N1G 2W1
> 
> office +1 519 824 4120 ext. 52975
> lab     +1 519 824 4120 ext. 53594
> fax:     +1 519 767 1656
> 
> email: lborger at uoguelph.ca
> www.researcherid.com/rid/C-6003-2008
> http://uoguelph.academia.edu/LucaBorger
> --------------------------------------------------------------------
> 
> 
> 
> 
>> ----- Original Message ----- From: "John Maindonald" 
>> <john.maindonald at anu.edu.au>
>> To: <r-sig-mixed-models at r-project.org>
>> Sent: Thursday, June 24, 2010 7:11 PM
>> Subject: [R-sig-ME] Fwd: lme4, lme4a,and overdispersed distributions 
>> (again)
>>
>>
>>> I think it more accurate to say that, in general, there may be
>>> a class of distributions, and therefore a possible multiplicity
>>> of likelihoods, not necessarily for distributions of exponential
>>> form.  This is a PhD thesis asking to be done, or maybe
>>> someone has already done it.
>>>
>>> Over-dispersed distributions, where it is entirely clear what the
>>> distribution is, can be generated as GLM model +  one random
>>> effect per observation.  We have discussed this before.  This
>>> seems to me the preferred way to go, if such a model seems to
>>> fit the data.  I've not checked the current state of play re fitting
>>> such models in lme4 or lme4a; in the past some versions have
>>> allowed such a model.
>>>
>>> I like the simplicity of the one random effect per observation
>>> approach, as against what can seem the convoluted theoretical
>>> framework in which beta binomials live.
>>>
>>> John Maindonald             email: john.maindonald at anu.edu.au
>>> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
>>> Centre for Mathematics & Its Applications, Room 1194,
>>> John Dedman Mathematical Sciences Building (Building 27)
>>> Australian National University, Canberra ACT 0200.
>>> http://www.maths.anu.edu.au/~johnm
>>>
>>>> On 25/06/2010, at 3:59 AM, Jeffrey Evans wrote:
>>>>
>>>>> Since I am definitely *not* a mathematician, I am straying in over my
>>>>> head here.
>>>>>
>>>>> I understand what you are saying - that there isn't a likelihood
>>>>> function for the quasi-binomial "distribution". And therefore, there
>>>>> is no such distribution.
>>>>>
>>>>> What do you think of the suggestion that a beta-binomial mixture
>>>>> distribution could be used to model overdispersed binomial data?
>>>>>
>>>>> Would this be a technically correct and logistically feasible
>>>>> solution?
>>>>>
>>>>> -jeff
>>>>>
>>>>> -----Original Message-----
>>>>> From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
>>>>> Sent: Thursday, June 24, 2010 1:25 PM
>>>>> To: Jeffrey Evans
>>>>> Cc: r-sig-mixed-models at r-project.org
>>>>> Subject: Re: [R-sig-ME] lme4, lme4a, and overdispersed distributions 
>>>>> (again)
>>>>>
>>>>> On Thu, Jun 24, 2010 at 11:54 AM, Jeffrey Evans
>>>>> <Jeffrey.Evans at dartmouth.edu> wrote:
>>>>>> Like others, I have experienced trouble with estimation of the scale
>>>>>> parameter using the quasi-distributions in lme4, which is necessary to
>>>>>> calculate QAICc and rank overdispersed generalized linear mixed models.
>>>>>> I had several exchanges with Ben Bolker about this early last year
>>>>>> after his TREE paper came out
>>>>>> (http://www.cell.com/trends/ecology-evolution/abstract/S0169-5347%2809%2900019-6),
>>>>>> and I know it's been discussed on this list. Has there been, or is
>>>>>> there, any potential resolution to this forthcoming in future releases
>>>>>> of lme4 or lme4a? I run into overdispersed binomial distributions
>>>>>> frequently and have had to use SAS to deal with them. SAS appears to
>>>>>> work, but it won't estimate the overdispersion parameter using Laplace
>>>>>> estimation (only PQL). As I understand it, these pseudo-likelihoods
>>>>>> can't be used for model ranking. I don't know why SAS can't/won't, but
>>>>>> lme4 will run these quasi-binomial and quasi-Poisson distributions with
>>>>>> Laplace estimation.
>>>>>
>>>>>> Is there a workable way to use lme4 for modeling overdispersed
>>>>>> binomial data?
>>>>> I have trouble discussing this because I come from a background as a
>>>>> mathematician and am used to tracing derivations back to the original
>>>>> definitions.  So when I think of a likelihood (or, equivalently, a
>>>>> deviance) to be optimized it only makes sense to me if there is a
>>>>> probability distribution associated with the model.  And for the
>>>>> quasi-binomial and quasi-Poisson families, there isn't a probability
>>>>> distribution.  To me that means that discussing maximum likelihood
>>>>> estimators for such models is nonsense.  The models simply do not
>>>>> exist.  One can play tricks in the case of a generalized linear model
>>>>> to estimate a "quasi-parameter" that isn't part of the probability
>>>>> distribution but it is foolhardy to expect that the tricks will
>>>>> automatically carry over to a generalized linear mixed model.
>>>>>
>>>>> I am not denying that data that are over-dispersed with respect to the
>>>>> binomial or Poisson distributions can and do occur.  But having data
>>>>> like this and a desire to model it doesn't make the quasi families
>>>>> real.  In his signature Thierry Onkelinx quotes
>>>>>
>>>>> The combination of some data and an aching desire for an answer does
>>>>> not ensure that a reasonable answer can be extracted from a given body
>>>>> of data.
>>>>> ~ John Tukey
>>>>>
>>>>> I could and do plan to incorporate the negative binomial family but,
>>>>> without a definition that I can understand of a quasi-binomial or
>>>>> quasi-Poisson distribution and its associated probability function,
>>>>> I'm stuck. To me it's a "build bricks without straw" situation - you
>>>>> can't find maximum likelihood estimates for parameters that aren't
>>>>> part of the likelihood.
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
*** NEW E-MAIL ADDRESSES:
***   bbolker at gmail.com , bolker at math.mcmaster.ca
bolker at ufl.edu / people.biology.ufl.edu/bolker
GPG key: people.biology.ufl.edu/bolker/benbolker-publickey.asc