[R] When to use quasipoisson instead of poisson family

Tue Apr 10 11:39:47 CEST 2007

On Tue, 10 Apr 2007, ronggui wrote:

> On 4/10/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
>> On Tue, 10 Apr 2007, Achim Zeileis wrote:
>> 
>> > On Tue, 10 Apr 2007, ronggui wrote:
>> >
>> >> It seems that MASS suggest to judge on the basis of
>> >> sum(residuals(mode,type="pearson"))/df.residual(mode).
>> 
>> Not really; that is the conventional moment estimator of overdispersion
>> and it does not suffer from the severe biases the unreferenced estimate
>> below has (and are illustrated in MASS).
>> 
>> >> My question: Is
>> >> there any rule of thumb of the cutpoiont value?
>> >>
>> >> The paper "On the Use of Corrections for Overdispersion"
>> 
>> Whose paper?  It is churlish not to give credit, and unhelpful to your
>> readers not to give a proper citation.
>
> Thanks for pointing this out. There is the citation:
> @article{lindsey1999,
> title={On the use of corrections for overdispersion},
> author={Lindsey, JK},
> journal={Applied Statistics},
> volume={48},
> number={4},
> pages={553--561},
> year={1999},
> }

And I can add 'it is helpful to know whose authority is being invoked', 
since (e.g.) some authors are not at all careful.

>> >> suggests overdispersion exists if the deviance is at least twice the
>> >> number of degrees of freedom.
>> 
>> Overdispersion _exists_:  'all models are wrong but some are useful'
>> (G.E.P. Box).  The question is if it is important in your problem, not it
>> if is detectable.
>
>
>> > There are also formal tests for over-dispersion. I've implemented one 
>> for
>> > a package which is not yet on CRAN (code/docs attached), another one is
>> > implemented in odTest() in package "pscl". The latter also contains
>> > further count data regression models which can deal with both
>> > over-dispersion and excess zeros in count data. A vignette explaining 
>> the
>> > tools is about to be released.
>> 
>> There are, but like formal tests for outliers I would not advise using
>> them, as you may get misleading inferences before they are significant,
>> and they can reject when the inferences from the small model are perfectly
>> adequate.
>> 
>> In general, it is a much better idea to expand your models to take account
>> of the sorts of departures your anticipate rather than post-hoc test for
>> those departures and then if those tests do not fail hope that there is
>> little effect on your inferences.
>
> Which is the better (or ) best way to expand the existing model?
> by adding some other relevant independent variables or by using other
> more suitable model like "Negative Binomial Generalized Linear Model"?

I can think of several approaches to lack of fit of a Poisson GLM

1) Missing explanatory variables.  If you have them, of course use them.

2) Correlated observations, probably in groups.  Here a GEE model can be 
appropriate.

3) Missing group random effects.  Similar in effect to 2) (and the random 
effects can even be correlated), but with a different interpretation.

4) Missing random effects at individual level, hence count observations 
are from a Poisson mixture such as a negative binomial.

5) Correlation at individual level, hence count observations are from a 
sum of correlated Poissons.

6) Use moment inference, e.g. quasi models such as quasipoisson.  This 
does not care about the causal mechanism but adjusts for the some of the 
effects of some unknown mechanisms.  (GEE is similar.)

You may or may not be able to tell these apart depending on your design.
However, it more often boils down to 'the sort of departures you 
anticipate' in choosing what to do.

>
> Thanks!
>
>> The moment estimator \phi of over-dispersion gives you an indication of
>> the likely effects on your inferences: e.g. estimated standard errors are
>> proportional to \sqrt(\phi).  Having standard errors which need inflating
>> by 40% seems to indicate that the rule you quote is too optimistic (even
>> when its estimator is reliable).
>> 
>> --
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>> 
>
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595