[R-sig-ME] Poisson or Gaussian when modelling count data + heteroscedasticity in predictor variables

Ben Bolker bbolker at gmail.com
Mon Nov 7 17:36:00 CET 2016

  Do you know whether you have over- or underdispersion?  For species
richness, if the regional species pool is relatively restricted,
species richness can actually be *underdispersed*, which would make
both the Poisson and the negative binomial poor fits.
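
  As a rough illustration of the underdispersion point (a sketch under
stated assumptions, not an analysis of the data in this thread): if each
of the species in a pool of fixed size occurs independently at a site
with the same probability, richness is binomial, and its variance/mean
ratio is about 1 minus that probability, i.e. below the value of 1
expected for Poisson counts. The pool size, occurrence probability,
number of sites and function names below are all hypothetical, and
Python is used only to keep the example self-contained.

```python
import random
import statistics


def dispersion_ratio(counts):
    """Sample variance divided by sample mean (about 1 for Poisson counts)."""
    return statistics.variance(counts) / statistics.mean(counts)


def simulate_richness(pool_size, p_present, n_sites, rng):
    """Richness per site when each of pool_size species occurs independently
    with probability p_present, so richness can never exceed pool_size."""
    return [
        sum(rng.random() < p_present for _ in range(pool_size))
        for _ in range(n_sites)
    ]


# Hypothetical settings: a pool of 25 species, each present with probability 0.5.
rng = random.Random(1)
richness = simulate_richness(pool_size=25, p_present=0.5, n_sites=500, rng=rng)

ratio = dispersion_ratio(richness)
# A ratio clearly below 1 indicates underdispersion relative to the Poisson.
print(f"variance/mean ratio: {ratio:.2f}")
```

  This raw ratio ignores covariates and random effects; for a fitted
model the analogous check is usually based on the Pearson residuals and
the residual degrees of freedom.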

  Generally, heteroscedasticity is a bigger problem than non-Normality
(sorry, I don't have a reference handy), so if you can deal with the
heteroscedasticity effectively with a linear model, I would say go for it.

On 16-11-07 11:28 AM, Luciana Motta wrote:
> Thank you Tom.
> My richness data go from 5 to 20, so I don't think that counts as "large". But
> the model diagnostic checks do look good.
> The model does not check out with the negative binomial, just as with the
> Poisson (there is no overdispersion anyway).
> I will check the individual observation-level term.
> Any other reading suggestions about heteroscedasticity vs. normality?
> Thank you again
> On Mon, Nov 7, 2016 at 4:56 PM, Tom Wilding <Tom.Wilding at sams.ac.uk> wrote:
>> Hi Luciana - if your count data is large (not 'near' zero) then the normal
>> model might be fine and you could then account for heteroscedasticity using
>> GLS (as you seem to have done) - if the model diagnostics check-out then it
>> should be OK.  You could log or log+1 transform your response and see how
>> that looks too, just for interest (if you have zeros then this approach is
>> not likely to be successful).  Also, you could stick with the Poisson GLMM
>> and include an individual observation-level term (Elston, D. A., et al.
>> (2001). "Analysis of aggregation, a worked example: numbers of ticks on red
>> grouse chicks." Parasitology 122(05): 563-569); this should also be
>> reasonable (and is very easy to implement) and might be of interest (though
>> it may have its detractors).  Your negative binomial solution should also
>> address the overdispersion issue, though I'm less sure what the residual
>> patterns should look like (you could fake up some data to check these).
>> Best
>> Tom.
>> -----Original Message-----
>> From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org]
>> On Behalf Of Luciana Motta
>> Sent: 07 November 2016 14:17
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] Poisson or Gaussian when modelling count data +
>> heteroscedasticity in predictor variables
>> Hello,
>> my name is Lucy, and I'm studying the richness of aquatic insects in lakes. I
>> took samples from different habitats in each lake, so I thought of a
>> mixed model with my predictors as fixed effects and lake/habitat as random
>> effects. I fitted a model using "glmer", to be able to use a Poisson
>> distribution for the response, given my type of response variable (count
>> data: richness).
>> But studying the data graphically, I suspected variance heterogeneity in 2
>> predictors.
>> I continued doing model selection with glmer with Poisson distribution,
>> but also made a model using "lmer" (therefore, Gaussian distribution of
>> residuals), to be able to model variance heterogeneity of those predictors
>> and see if models fit better with it.
>> Finally, yes... the "lmer" model, with a Gaussian distribution and varExp
>> modelling of the variance for those predictors, seems much more adequate than
>> the "glmer" with Poisson (a conclusion I reached by studying residuals,
>> fitted values, Q-Q plots and normality tests).
>> Can heteroscedasticity be a larger problem to be accounted for, than the
>> distribution of the errors for count data? I read that sometimes
>> heteroscedasticity can be masking what we think is a normality problem.
>> Also that the Poisson distribution accounts for heteroscedasticity... but in my
>> case, the model seems much worse. It is just that Poisson, Neg. Binom., etc.
>> are so often recommended for count data that I don't really know if I'm plain
>> wrong in even considering staying with Gaussian. Any suggestions/further
>> readings about this?
>> Thank you very much,
>> --
>> Luciana M. Motta
>> Licenciada en Cs. Biológicas FCEyN, U.B.A.
