[R-sig-eco] Negative binomial

Carsten Dormann carsten.dormann at ufz.de
Fri Oct 2 08:29:10 CEST 2009


Dear Joao,

I propose you do the following (and wait for the outcry-responses to 
this email to see if it is a reasonable proposal):

Fit your model with different types of distributions and compare their 
logLik-values:
logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=gaussian))
logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=poisson))
logLik(glm(y ~ x1+x2+x3+I(x1^2) + x1:x3, family=quasipoisson)) # returns NA: quasi-families have no likelihood
logLik(glm.nb(y ~ x1+x2+x3+I(x1^2) + x1:x3)) # needs library(MASS)

The distribution whose model has the highest log-likelihood is the one 
to choose, and you can defend that choice against the reviewers.
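
A minimal sketch of the comparison above, run on simulated data. The 
data frame d, the coefficients and the overdispersion (size = 2) are 
made up for illustration and are not Joao's data. AIC is shown next to 
the log-likelihood because the models differ in their number of 
estimated parameters. quasipoisson is left out because logLik() returns 
NA for quasi-families.

```r
library(MASS)  # provides glm.nb

set.seed(1)
# Simulated, overdispersed counts (illustrative assumption only)
n  <- 200
d  <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
mu <- exp(1 + 0.8 * d$x1 - 0.5 * d$x2 + 0.3 * d$x3)
d$y <- rnbinom(n, mu = mu, size = 2)

form <- y ~ x1 + x2 + x3 + I(x1^2) + x1:x3

# Same linear predictor, different error distributions
fits <- list(
  gaussian = glm(form, family = gaussian, data = d),
  poisson  = glm(form, family = poisson,  data = d),
  negbin   = glm.nb(form, data = d)
)

# Log-likelihood, number of estimated parameters, and AIC per model
cmp <- data.frame(
  logLik = sapply(fits, function(m) as.numeric(logLik(m))),
  df     = sapply(fits, function(m) attr(logLik(m), "df")),
  AIC    = sapply(fits, AIC)
)
print(cmp)
```

With real data, replace d and form by your own data frame and model 
formula; the rest of the sketch stays the same.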

A few notes:
1. You obviously cannot do this when one of the models uses a 
transformed response (e.g. log(y)), because its log-likelihood is then 
on a different scale and no longer comparable.
2. When you use a more complex model (say a GLMM), you can approximate 
the neg.bin through a two-step procedure. First, fit a (wrongly 
structured) glm.nb and extract the theta value from the summary of the 
model, say theta=4.5 (theta is the second parameter of the neg.bin 
distribution, next to the mean). Second, fit the GLMM, giving as family 
the argument negative.binomial(theta=4.5) (again from package MASS). The 
same holds for GAMs and other models requiring a specification of family.
3. You may want to dig around for books recommending the above 
procedure. I think I got this as advice from someone else, but haven't 
bothered yet to look it up (obviously MASS would be a good starting 
place, in their description of the neg.bin). I saw a paper that does 
this (using the minimum AIC, but otherwise this approach). It is an 
ecological rather than a statistical paper, although the analyst in 
the author group is a biometrician whom I fully trust: Weigelt, A., 
Schumacher, J., Walther, T., Bartelheimer, M., Steinlein, T., Beyschlag, 
W. (2006) Identifying mechanisms of competition in multispecies 
communities. Journal of Ecology 95:53-64.
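
The two-step procedure from note 2 can be sketched as follows. The data 
are simulated and the variable names (d, nb1, nb2) are hypothetical. A 
plain glm stands in for the second-step model here, simply to show 
where the fixed theta goes; for a GLMM or GAM the family object would 
be passed to the respective fitting function instead, which is not 
checked in this sketch.

```r
library(MASS)  # provides glm.nb and negative.binomial

set.seed(2)
# Simulated counts with a known theta of 3 (illustrative assumption only)
n <- 200
d <- data.frame(x1 = runif(n))
d$y <- rnbinom(n, mu = exp(1 + 0.8 * d$x1), size = 3)

# Step 1: fit glm.nb only to obtain an estimate of theta
nb1   <- glm.nb(y ~ x1, data = d)
theta <- nb1$theta

# Step 2: refit with theta held fixed via a family object
nb2 <- glm(y ~ x1, family = negative.binomial(theta = theta), data = d)

# The coefficients of both fits should agree closely
print(cbind(glm.nb = coef(nb1), fixed_theta = coef(nb2)))
```

Note that theta is treated as known in the second step, so standard 
errors from that model do not reflect the uncertainty in theta.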

HTH,

Carsten


Canning-Clode, Joao wrote:
> Hi all,
>
> 1st time user here!
> I am an ecologist working with marine fouling assemblages. I just got a paper back for revision. I am working with count data (species richness). I have used a linear model but the reviewers are recommending the use of negative binomial or Poisson. As far as I could understand from the literature these complex models should be used and the distribution is skewed left (lots of zeros). Well, my data is perfectly normally distributed. My main question is: can I still use negative binomial or Poisson even if my data is normal? Does that make sense?
>
> Thanks in advance
>
> João Canning Clode, PhD
> Postdoctoral Fellow
> Marine Invasions Research Lab
> Smithsonian Environmental Research Center
> 647 Contees Wharf Road
> Edgewater, MD 21037
>
> Email: canning-clodej at si.edu
> Web: www.canning-clode.com
> Tel: 443-482-2354
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>   

-- 
Dr. Carsten F. Dormann
Department of Computational Landscape Ecology
Helmholtz Centre for Environmental Research-UFZ
Permoserstr. 15
04318 Leipzig
Germany

Tel: ++49(0)341 2351946
Fax: ++49(0)341 2351939
Email: carsten.dormann at ufz.de
internet: http://www.ufz.de/index.php?de=4205


