[R-sig-eco] Low counts

Nicholas Lewin-Koh nikko at hailmail.net
Tue Feb 2 17:40:15 CET 2010


Hi Tore,
What you're seeing in the residuals may just be due to the discreteness
of count data.
Gordon Smyth has a nice paper on this topic (and code in the statmod package):
Dunn, P. K., and Smyth, G. K. (1996). Randomized quantile residuals.
Journal of Computational and Graphical Statistics 5, 236-244.
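
A minimal sketch of the randomized quantile residual idea for a Poisson
response, written in Python with only the standard library. It is not the
statmod code. The mean of 0.8, the sample size and the seed are made-up
values, and the true mean stands in for a fitted one. For an observed count
y, draw u uniformly between F(y - 1) and F(y), where F is the Poisson CDF,
and take the standard normal quantile of u. Under a correct model these
residuals are standard normal, so the stripes disappear.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def poisson_cdf(k, mu):
    """Return P(Y <= k) for Y ~ Poisson(mu); 0.0 when k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(Y = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i  # P(Y = i) from P(Y = i - 1)
        total += term
    return min(total, 1.0)


def sample_poisson(mu, rng):
    """Draw one Poisson(mu) variate (Knuth's method, fine for small mu)."""
    limit = math.exp(-mu)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def randomized_quantile_residual(y, mu, rng):
    """Draw u uniformly between F(y - 1) and F(y), then map to a normal score."""
    lower = poisson_cdf(y - 1, mu)
    upper = poisson_cdf(y, mu)
    u = rng.uniform(lower, upper)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    return NormalDist().inv_cdf(u)


rng = random.Random(2010)  # fixed seed, arbitrary choice
mu = 0.8                   # hypothetical small mean, not from the thread
counts = [sample_poisson(mu, rng) for _ in range(2000)]
residuals = [randomized_quantile_residual(y, mu, rng) for y in counts]

print("distinct counts:   ", len(set(counts)))
print("distinct residuals:", len(set(residuals)))
print("residual mean, sd: ", round(mean(residuals), 3), round(stdev(residuals), 3))
```

The counts take only a handful of values, while the residuals are spread
continuously with mean near 0 and standard deviation near 1.
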
In general the "stars at night" look is what you want to see in residuals,
but with small counts the Poisson or binomial may be just fine and the
residuals will still show that striping effect: the mean is very small, so
the response takes only a handful of distinct values and each value traces
its own band across the plot.

Before you go running to a zero-inflated model or a glmm, squint your eyes
and look at the results you are getting from the simple glm. Is the model
adequate? Does it describe the data sufficiently? Is the dispersion estimate
from the quasipoisson fit far from 1? (Quasipoisson gives the same
coefficients as Poisson and has no true likelihood, so the dispersion
estimate and the rescaled standard errors are what to compare.) Especially
if you don't have a lot of data, all these fancy models may not be worth the
cost of estimating all the extra parameters. But at the end of the day,
since we can't see the data, you need to make the call.
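
To see that striping on its own is not evidence of a bad model, here is a
small simulation sketch in Python (standard library only). The data come
from a correctly specified Poisson model with a small mean. The coefficients
b0 and b1, the sample size and the seed are made-up values, and the true
mean is used in place of a fitted one to keep the example short. Each
observed count y puts its Pearson residuals on the curve
r = (y - mu) / sqrt(mu), one band per count, while the Pearson dispersion
stays close to 1.

```python
import math
import random


def sample_poisson(mu, rng):
    """Draw one Poisson(mu) variate (Knuth's method, fine for small mu)."""
    limit = math.exp(-mu)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


rng = random.Random(42)  # fixed seed, arbitrary choice
n = 2000
b0, b1 = -1.0, 1.0       # hypothetical coefficients; mean runs from about 0.37 to 1

bands = {}               # observed count -> list of (mu, Pearson residual)
pearson_sq = 0.0
for _ in range(n):
    x = rng.random()                   # covariate on [0, 1)
    mu = math.exp(b0 + b1 * x)         # log link
    y = sample_poisson(mu, rng)
    r = (y - mu) / math.sqrt(mu)       # Pearson residual
    bands.setdefault(y, []).append((mu, r))
    pearson_sq += r * r

# The true mean is used, so no estimated parameters are subtracted from n.
dispersion = pearson_sq / n

for y in sorted(bands):
    print(f"count {y}: {len(bands[y])} points on r = ({y} - mu) / sqrt(mu)")
print("Pearson dispersion:", round(dispersion, 3))
```

With a fitted glm the denominator would be the residual degrees of freedom
rather than n, but the picture is the same: a few bands and a dispersion
near 1 indicate an adequate Poisson fit, not a problem.
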

Cheers
Nicholas

 
> Message: 1
> Date: Mon, 1 Feb 2010 15:04:22 +0100
> From: "Tore Chr Michaelsen" <tore.michaelsen at bio.uib.no>
> To: <r-sig-ecology at r-project.org>
> Subject: [R-sig-eco] Low counts
> Message-ID: <000801caa347$74c62a70$5e527f50$@michaelsen at bio.uib.no>
> Content-Type: text/plain;       charset="us-ascii"
> 
> Dear members;
> 
> 1) I have fitted a glm to count data (using quasipoisson to correct for
> disp.). In the final model, the relationship between Res and Fitted (i.e.
> the line going through the plot) and QQ looks fine, but I am worried that
> low count (one to five) could violate some assumption of the glm/poisson:
> Although the line in the Res vs Fitted plot looks nice, the values show a
> clear pattern (five diagonal lines = the counts). Crawley/R book says it
> should look like the sky at night with no patterns. I assume patterns are
> not visible with large counts (e.g. 0-100), but highly visible with low
> counts as in this case. I still assume this is reason for some concern about
> the model, or is the concern not justified?
> 
> 2) Any recommendations on literature regarding model inspection in R?
> 
> Thank you for reading this mail!
> 
> Best wishes;
> Tore
> 
> 
>


