[R-sig-ME] underdispersion, (Poisson vs Gaussian)

Luciana Motta tasmacetus at gmail.com
Tue Nov 8 00:14:02 CET 2016


Dear Ben Bolker

As you correctly suggested, I checked the dispersion parameter, and I
actually have underdispersion:

overdisp_fun(Model.Poiss)

     chisq      ratio          p
16.3184448  0.4594328  0.9991835

In this regard, I checked for collinearity among the variables and for
outliers, which Zuur (2009) points out could be reasons for this, and
neither is a serious problem in my data.
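
For readers following along, the ratio above is the sum of squared Pearson
residuals divided by the residual degrees of freedom. Below is a minimal
sketch in base R, assuming a plain Poisson glm() on simulated data rather
than the glmer fit from this thread. The helper name dispersion_check is
hypothetical, and the simulated counts are an assumption chosen so that the
variance is below the mean. If the reported p comes from an upper-tail
chi-square test, as in the commonly circulated overdisp_fun, then a value
near 1 is consistent with underdispersion, and the lower-tail probability is
the one that tests for it.

```r
# Hypothetical helper: Pearson chi-square dispersion check for a fitted GLM.
dispersion_check <- function(model) {
  rdf   <- df.residual(model)                   # residual degrees of freedom
  rp    <- residuals(model, type = "pearson")   # Pearson residuals
  chisq <- sum(rp^2)
  c(chisq   = chisq,
    ratio   = chisq / rdf,                      # ~1 under a true Poisson model
    rdf     = rdf,
    p_over  = pchisq(chisq, df = rdf, lower.tail = FALSE),  # overdispersion
    p_under = pchisq(chisq, df = rdf, lower.tail = TRUE))   # underdispersion
}

# Simulated (assumed) data: binomial counts have variance below their mean,
# so a Poisson fit to them should look underdispersed.
set.seed(1)
y   <- rbinom(48, size = 20, prob = 0.6)
fit <- glm(y ~ 1, family = poisson)

res <- dispersion_check(fit)
print(res)
```

The same two ingredients, residuals(model, type = "pearson") and
df.residual(model), are also available for glmer fits, although the residual
degrees of freedom are only approximate when random effects are present.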

Regarding Mr. Ives's comments: I have 6 richness values in each lake, across
8 lakes: 48 observations. I understand your point that the main statistical
use of GLMs is to account for heteroscedasticity, but using a Gaussian
instead of a Poisson distribution would mean treating my response variable
as "continuous" when it is not, and that is something I don't completely
understand. Sorry, I probably have more studying to do in that regard. I
will check the Warton article.

Many thanks!
Luciana

On Mon, Nov 7, 2016 at 5:49 PM, Anthony R. Ives <arives at wisc.edu> wrote:

> Luciana,
>
> Although it is always hard to say, 5 to 20 is not necessarily "small".
> Really, it is all about the diagnostics, which you say point to using a
> linear model. In your original email you made a distinction between
> heteroscedasticity and "count data". This really isn't a distinction,
> because the main statistical thing GLMs do is to account for
> heteroscedasticity being driven by different sampling distributions (i.e.,
> the variance-mean relationship). If you have enough data points to identify
> heteroscedasticity, then I think the heteroscedasticity should be your
> focus. Note, however, that estimating heteroscedasticity and incorporating
> this into your analysis can be problematic for small counts. You are very
> right to worry about heteroscedasticity in general, which can play havoc
> with type I error.
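
To make the variance-mean relationship above concrete, here is a short
summary of the variance each candidate model implies. The notation is an
assumption of this note and not taken from the thread: \mu_i is the expected
count, v_i a variance covariate, and \sigma^2, \delta, \phi and \theta are
parameters estimated from the data.

```latex
\begin{align*}
\text{Gaussian (homoscedastic):} \quad
  & \operatorname{Var}(Y_i) = \sigma^2 \\
\text{Gaussian with exponential variance function:} \quad
  & \operatorname{Var}(Y_i) = \sigma^2 \exp(2\,\delta\,v_i) \\
\text{Poisson:} \quad
  & \operatorname{Var}(Y_i) = \mu_i \\
\text{Quasi-Poisson:} \quad
  & \operatorname{Var}(Y_i) = \phi\,\mu_i \\
\text{Negative binomial (NB2):} \quad
  & \operatorname{Var}(Y_i) = \mu_i + \mu_i^2/\theta, \qquad \theta > 0
\end{align*}
```

Read this way, a dispersion ratio below 1 corresponds to \phi < 1, and the
negative binomial cannot represent it, because its variance is never smaller
than the mean.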
>
> Linear models (with GLS to account for heteroscedasticity) can have fine
> performance in terms of type I errors. There can be a loss of power, but
> not always. You might find a recent paper useful: Warton et al. 2016
> http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12552/abstract
>
> Cheers, Tony
>
>
> -----Original Message-----
> From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on
> behalf of Luciana Motta <tasmacetus at gmail.com>
> Date: Monday, November 7, 2016 at 10:28 AM
> To: Tom Wilding <Tom.Wilding at sams.ac.uk>
> Cc: "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
> Subject: Re: [R-sig-ME] Poisson or Gaussian when modelling count data +
> heteroscedasticity in predictor variables
>
>     Thank you Tom.
>
>     My richness data go from 5 to 20; I don't think that counts as "large".
>     But the model diagnostic checks do look good.
>
>     The model does not check out well with the negative binomial, just as
>     with the Poisson (there is no overdispersion anyway).
>
>     Will check about the individual observation level term.
>
>     Any other reading suggestions about heteroscedasticity vs normality?
>
>     Thank you again
>
>
>
>
>     On Mon, Nov 7, 2016 at 4:56 PM, Tom Wilding <Tom.Wilding at sams.ac.uk> wrote:
>
>     > Hi Luciana - if your count data is large (not 'near' zero) then the
>     > normal model might be fine and you could then account for
>     > heteroscedasticity using GLS (as you seem to have done) - if the model
>     > diagnostics check-out then it should be OK.  You could log or log+1
>     > transform your response and see how that looks too, just for interest
>     > (if you have zeros then this approach is not likely to be successful).
>     > Also, you could stick with the Poisson GLMM and include an individual
>     > observation level term (Elston, D. A., et al. (2001). "Analysis of
>     > aggregation, a worked example: numbers of ticks on red grouse chicks."
>     > Parasitology 122(05): 563-569), - this should also be reasonable (and
>     > is very easy to implement) and might be of interest (though may have
>     > its detractors).  Your negative binomial solution should also address
>     > the over-dispersion issue though I'm less sure what the residual
>     > patterns should look like (you could fake-up some data to check these).
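
A minimal sketch of the observation-level term suggested above, assuming
simulated data with the layout described in this thread (8 lakes, 6 values
each). The variable names (dat, lake, x, richness, obs) and all simulation
settings are assumptions for illustration, and the glmer call only runs if
lme4 is installed. Note that an observation-level random effect can only add
variance to the Poisson model, so it addresses overdispersion and would not
be expected to help with underdispersion.

```r
# Simulated (assumed) data: 8 lakes x 6 richness values = 48 rows.
set.seed(42)
n_lake <- 8
n_per  <- 6
dat <- data.frame(
  lake = factor(rep(seq_len(n_lake), each = n_per)),
  x    = rnorm(n_lake * n_per)
)
lake_eff <- rnorm(n_lake, sd = 0.2)            # lake-level random intercepts
mu <- exp(2.3 + 0.2 * dat$x + lake_eff[as.integer(dat$lake)])
dat$richness <- rpois(nrow(dat), mu)

# Observation-level term: a factor with one level per row.
dat$obs <- factor(seq_len(nrow(dat)))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Poisson GLMM with a lake intercept plus the observation-level intercept.
  m_olre <- lme4::glmer(richness ~ x + (1 | lake) + (1 | obs),
                        family = poisson, data = dat)
  # An obs variance estimated at (or near) zero suggests no extra-Poisson
  # variation for this term to absorb.
  print(lme4::VarCorr(m_olre))
}
```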
>     >
>     > Best
>     >
>     > Tom.
>     >
>     >
>     >
>     > -----Original Message-----
>     > From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org]
>     > On Behalf Of Luciana Motta
>     > Sent: 07 November 2016 14:17
>     > To: r-sig-mixed-models at r-project.org
>     > Subject: [R-sig-ME] Poisson or Gaussian when modelling count data +
>     > heteroscedasticity in predictor variables
>     >
>     > Hello,
>     >
>     > my name is Lucy, and I'm studying the richness of aquatic insects in
>     > lakes. I took samples from different habitats in each lake, for which
>     > I thought of a mixed model with my predictors as fixed effects, and
>     > lake/habitat as random effects. I fitted a model using "glmer", to be
>     > able to use Poisson distribution for residuals, due to my type of
>     > response variable (count data - richness).
>     >
>     > But studying the data graphically, I suspected variance heterogeneity
>     > in 2 predictors.
>     >
>     > I continued doing model selection with glmer with Poisson
>     > distribution, but also made a model using "lmer" (therefore, Gaussian
>     > distribution of residuals), to be able to model variance heterogeneity
>     > of those predictors and see if models fit better with it.
>     >
>     > Finally, yes..."lmer" model, with Gaussian distribution and varExp
>     > modelling for the variance of those predictors, seems much more
>     > adequate than the "glmer" with Poisson (a conclusion I arrived at by
>     > studying residuals, fitted values, qqplot and normality tests).
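
A minimal sketch of the kind of heteroscedastic Gaussian mixed model
described above. varExp is a variance-function class from the nlme package,
supplied through the weights argument of lme() or gls(). As far as I know,
lme4::lmer does not accept nlme variance functions, so this sketch assumes
the model was fitted with nlme. The data are simulated, the response is
continuous rather than a real count, and all names (dat, lake, x, y) are
assumptions for illustration.

```r
library(nlme)

# Simulated (assumed) data: 8 lakes x 6 values, residual SD growing with x.
set.seed(7)
n_lake <- 8
n_per  <- 6
dat <- data.frame(
  lake = factor(rep(seq_len(n_lake), each = n_per)),
  x    = runif(n_lake * n_per, 0, 2)
)
lake_eff <- rnorm(n_lake, sd = 1)              # lake-level random intercepts
dat$y <- 10 + 2 * dat$x + lake_eff[as.integer(dat$lake)] +
  rnorm(nrow(dat), sd = 0.5 * exp(0.8 * dat$x))

# Homoscedastic random-intercept model.
m_hom <- lme(y ~ x, random = ~ 1 | lake, data = dat, method = "REML")

# Same model with Var(e_i) = sigma^2 * exp(2 * delta * x_i).
m_het <- update(m_hom, weights = varExp(form = ~ x))

# Both fits share the same fixed effects, so the REML comparison is valid.
print(anova(m_hom, m_het))

# Estimated delta, on its natural (constrained) scale.
print(coef(m_het$modelStruct$varStruct, unconstrained = FALSE))
```

The anova() table reports AIC, BIC and a likelihood-ratio test for the two
variance structures, which is one way to judge whether the variance function
is worth keeping, alongside the residual plots mentioned above.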
>     >
>     > Can heteroscedasticity be a larger problem to account for than the
>     > distribution of the errors for count data? I read that sometimes
>     > heteroscedasticity can be masking what we think is a normality
>     > problem, and also that the Poisson distribution accounts for
>     > heteroscedasticity... but in my case, the model seems much worse. It
>     > is just that, since Poisson, Neg. Binom., etc. are so recommended for
>     > count data, I don't really know if I'm plain wrong in even considering
>     > staying with Gaussian. Any suggestions/further readings about this?
>     >
>     > Thank you very much,
>     >
>     > --
>     > Luciana M. Motta
>     > Licenciada en Cs. Biológicas FCEyN, U.B.A.
>     >
>     > _______________________________________________
>     > R-sig-mixed-models at r-project.org mailing list
>     > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
>
>
>


-- 
Luciana M. Motta
Licenciada en Cs. Biológicas FCEyN, U.B.A.
CENAC (Parque Nacional Nahuel Huapi) - CONICET
Argentina
www.cenacbariloche.com.ar



