[R-sig-ME] Dealing with Overdispersion in Count Data with Mixed Modeling
Ben Bolker
bbolker at gmail.com
Thu Nov 11 20:43:04 CET 2010
On 11/11/2010 02:11 PM, David Stainbrook wrote:
> Ben,
>
> Putting aside the issues of model convergence, I would like to focus on
> the issue of how to deal with overdispersion in count data. Typically, a
> negative binomial or quasi-poisson model deals with overdispersion in
> situations where it is not appropriate to use Poisson. However, to my
> knowledge, lme4 does not allow multiple random effects using the
> negative binomial family
[correction: it doesn't implement negative binomial models at all,
although I believe that *in* principle this could be added by using an
additional 'k' parameter that controlled the mean-variance relationship,
in a way analogous to glm.nb() in the MASS package].
and quasi-poisson has issues with the
> likelihood estimation. As you suggested in a previous email, and from
> the article at
> <http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=82701
> <http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=82701>>,
> translating the model to a lognormal-Poisson model by using an
> individual-level random variable and using family=poisson should deal
> with the overdispersion.
Yes.
>
> I wanted to clarify what you meant by the individual level to make sure
> that I am doing it correctly. I already have a random intercept for each
> individual (1|LT), where LT=lowest tag number for each individual, to
> account for pseudoreplication. For the individual-level random variable
> to deal with overdispersion, did you mean adding (1|Observation), where
> Observation is a vector from 1 to the total number of observations or
> records in the dataset?
yes, exactly.
>
> This is demonstrated here in a very simple model with no covariates:
> model_1<-lmer(Y~1 + (1|LT) + (1|Observation), data=df3, ! family=poisson)
>
> and here with an additional covariate (covx) with both fixed and random
> effects:
> model_2<-lmer(Y~1 + (1+covx|LT) + (1|Observation) + covx, data=df3,
> family=poisson)
>
> Is this what you meant and is it coded correctly?
>
> I tried this code with an older version of R and lme4, and it yielded an
> error and no summary data, but after updating both the version of R and
> the lme4 library, I received a warning "Number of levels of a grouping
> factor for the random effects is *equal* to n, the number of
> observations", but it yielded summary information that I took to be
> legitimate.
That is what I would expect.
Ben Bolker
>
> Thanks for the help,
>
> ..................................................................
>
> David Stainbrook
> M.S. Graduate Research Assistant
> Pennsylvania Cooperative Fish & Wildlife Research Unit
> The Pennsylvania State University
>
> ..................................................................
>
>
> On Mon, Nov 1, 2010 09:53 PM, *Ben Bolker <bbolker at gmail.com>* wrote:
>
> [cc'ing r-sig-mixed-models: it's best to keep sending replies back to
> the list so they can be archived and others can read them, or offer input]
>
> On 10-11-01 01:28 PM, David Stainbrook wrote:
> > Ben,
> >
> > Thanks for your input. I read that article that you suggested and it
> > appears they used SAS and Genstat to do their analysis.
>
> Yes (although as I said at the time, I wouldn't actually trust the
> methods that they used in Genstat for this problem. I just think their
> description of the problem is clear).
>
> > Is it possible
> > to use the Poisson-lognormal model in R or translate the model to this
> > using R and lmer? Another professor mentioned that I may be able to get
> > it to work using a negative binomial model in SAS or ADModel Builder.
> > What do you suggest?
>
> Yes, you can use the Poisson-lognormal in recent versions of lme4,
> simply by including an individual-level random variable. You may get
> warnings.
> You could indeed use a negative binomial model in SAS or AD Model
> Builder (in ADMB you could also use the lognormal-Poisson model).
>
> > Do you have any idea why Doug allowed the lmer function to fit
> > quasipoisson if he doesn't feel that the results will be reliable? I
> > would have trusted my results and wouldn't have had any idea that they
> > might have been unreliable if he had not said that.
>
> I believe he implemented it a while ago and his opinions have now
> changed. (I agree that it might be a good idea to disable this
> functionality.)
>
> > Also, do you have any idea how to increase both the default number of
> > function evaluations and iterations with the control statement within
> > the lmer model statement?
>
> ...,control=list(maxIter=2000,maxFN=3000),... should work.
>
> * you seem to be tackling a difficult problem. I appreciate that
> you're offering full details on your problem (full scripts and data),
> but it's going to take someone else at least half an hour (and probably
> quite a bit more) to get up to speed on what you're doing and what's not
> working; unfortunately, that's more than most anyone has time for,
> unless the problem happens to be something very close to their
> interests. Unfortunately, you may well need to find local help for this
> (your advisor? a friendly stats professor or graduate student?) - <I already exhausted those options, hence contacting you and Doug>
> * it's possible, depending on the complexity of your model, that
> you're simply trying to fit too complicated a model. You do have a lot
> of data points, but some of your covariates may be strongly correlated.
> Have you tried:
> - seeing if you can successfully fit a subset of the data points
> (this could be faster, allowing you to debug quicker)?
> - seeing if you can successfully fit a subset of the covariates, or
> which covariates or combinations of covariates are problematic?
> - seeing if you can successfully fit a non-mixed (GLM) model,
> treating 'individual' as a fixed effect?
> - simulating data, possibly in a simplified form, to see if you can
> get the right answer when you know what it is?
> * lme4 is quite finicky about convergence, on the philosophy that it's
> better not to give an answer than to give a wrong one.
>
> R does have its advantages, but if you're up to working with SAS or AD
> Model Builder I would recommend you also try those approaches -- see if
> you run into the same problems. But I would definitely try some of the
> trouble-shooting strategies above, first.
>
> good luck,
> Ben Bolker
>
>
> > Thanks again,
> >
> > David
>
> >
> > On Thu, Oct 28, 2010 04:49 PM, *Ben Bolker <bbolker at gmail.com>*
> wrote:
> >
> > My advice would be to use an individual-level random variable
> > (translating to a lognormal-Poisson model, which is qualitatively
> > similar to a negative binomial) -- see e.g. Elston et al 2001 for
> a
> > decent explanation, although you should not necessarily trust the
> > numeric methods they use ...
> >
> > [Elston, D. A., R. Moss, T. Boulinier, C. Arrowsmith, and X. Lambin.
> > 2001. Analysis of Aggregation, a Worked Example: Numbers of Ticks on
> Red
> > Grouse Chicks. Parasitology 122, no. 05: 563-569.
> > doi:10.1017/S0031182001007740.
> >
> http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=82701.]
> >
> >
> > (Doug, thanks for the vote of confidence!)
> >
> > cheers
> > Ben
> >
> >
> >
> >
> >
>
>
>
>
More information about the R-sig-mixed-models
mailing list