[R-sig-ME] Poisson or Gaussian when modelling count data + heteroscedasticity in predictor variables

Mon Nov 7 16:56:43 CET 2016

Hi Luciana - if your counts are large (not near zero) then a normal model might be fine, and you could then account for heteroscedasticity using GLS (as you seem to have done) - if the model diagnostics check out then it should be OK. You could also log or log(y + 1) transform your response and see how that looks, just for interest (if you have zeros then this approach is not likely to be successful). Alternatively, you could stick with the Poisson GLMM and include an observation-level random effect (Elston, D. A., et al. (2001). "Analysis of aggregation, a worked example: numbers of ticks on red grouse chicks." Parasitology 122(05): 563-569) - this should also be reasonable (and is very easy to implement) and might be of interest (though it may have its detractors). Your negative binomial solution should also address the over-dispersion issue, though I'm less sure what the residual patterns should look like (you could fake up some data to check these).
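
A minimal sketch of the "fake up some data" idea, written in Python with only the standard library rather than in R. It simulates plain Poisson counts and counts with extra log-normal noise added per observation (the kind of variation an observation-level term is meant to absorb), then compares the variance-to-mean ratio of each. The mean of 10, the noise SD of 0.5, the sample size and the helper name `rpois` are all assumptions chosen for illustration, not values from the thread.

```python
import math
import random
import statistics


def rpois(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's method; fine for modest lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


rng = random.Random(1)   # fixed seed so the run is reproducible
n = 5000                 # assumed number of simulated observations
mu = 10.0                # assumed mean count
sigma = 0.5              # assumed SD of the observation-level effect (log scale)

# Pure Poisson counts: variance should be close to the mean.
y_pois = [rpois(mu, rng) for _ in range(n)]

# Poisson counts whose log-mean gets an extra normal deviate per observation.
y_olre = [rpois(mu * math.exp(rng.gauss(0.0, sigma)), rng) for _ in range(n)]


def dispersion(y):
    """Variance-to-mean ratio; roughly 1 for Poisson data with a common mean."""
    return statistics.variance(y) / statistics.fmean(y)


ratio_pois = dispersion(y_pois)
ratio_olre = dispersion(y_olre)
print("variance/mean, pure Poisson:         ", round(ratio_pois, 2))
print("variance/mean, observation-level noise:", round(ratio_olre, 2))
```

A ratio well above 1 in the second case is the over-dispersion signature; the same fake data could then be fed to whichever model is being checked to see what its residual plots look like when the truth is known.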

Best

Tom.

-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Luciana Motta
Sent: 07 November 2016 14:17
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Poisson or Gaussian when modelling count data + heteroscedasticity in predictor variables

Hello,

my name is Lucy, and I'm studying the richness of aquatic insects in lakes. I took samples from different habitats in each lake, so I thought of a mixed model with my predictors as fixed effects and lake/habitat as random effects. I fitted a model using "glmer", to be able to use a Poisson distribution, because of my type of response variable (count data: richness).

But studying the data graphically, I suspected variance heterogeneity in 2 predictors.

I continued doing model selection with glmer and a Poisson distribution, but I also fitted a model using "lmer" (and therefore a Gaussian distribution of residuals), to be able to model the variance heterogeneity associated with those predictors and see whether the models fit better with it.

In the end, yes: the "lmer" model, with a Gaussian distribution and varExp modelling of the variance for those predictors, seems much more adequate than the "glmer" with Poisson (a conclusion I reached by studying residuals, fitted values, QQ plots and normality tests).

Can heteroscedasticity be a larger problem to account for than the distribution of the errors for count data? I read that heteroscedasticity can sometimes mask what we think is a normality problem, and also that the Poisson distribution accounts for heteroscedasticity... but in my case, the Poisson model seems much worse. It is just that Poisson, negative binomial, etc. are so strongly recommended for count data that I don't really know whether I'm plain wrong in even considering staying with Gaussian. Any suggestions or further reading about this?
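
As a rough illustration of what "Poisson accounts for heteroscedasticity" means, the sketch below (Python, standard library only) simulates Poisson counts at three means and reports the sample variance and skewness for each. The means 2, 10 and 50, the sample size and the helper names are assumptions picked for the example. The point it is meant to show is that a Poisson model builds in one specific kind of heteroscedasticity (variance equal to the mean), and that the counts look more symmetric as the mean grows.

```python
import math
import random
import statistics


def rpois(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's method; fine for modest lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def skewness(xs):
    """Sample skewness (third standardised moment, population form)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)


rng = random.Random(2016)  # fixed seed so the run is reproducible
summary = {}
for lam in (2, 10, 50):    # assumed mean counts, small to large
    draws = [rpois(lam, rng) for _ in range(4000)]
    summary[lam] = (
        statistics.fmean(draws),     # sample mean
        statistics.variance(draws),  # sample variance, expected near the mean
        skewness(draws),             # expected to shrink as the mean grows
    )
    mean, var, skew = summary[lam]
    print(f"lambda={lam:>2}  mean={mean:6.2f}  variance={var:6.2f}  skewness={skew:5.2f}")
```

If the variance in real data grows faster or slower than the mean, or depends on a predictor in some other way, the Poisson variance assumption will not capture it, which is one way a Gaussian model with an explicit variance structure can end up fitting better.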

Thank you very much,

--
Luciana M. Motta
Licenciada en Cs. Biológicas FCEyN, U.B.A.


_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models