[R-sig-ME] Unbalanced presence/absence data

Andrew J Tyre atyre2 at unlnotes.unl.edu
Tue Feb 3 15:41:32 CET 2009

Hi Anna,

if your covariates are at the site level, then I suggest reducing your 
sample to a pure binomial case: counts of individuals with and without 
parasites at each site. This is exactly the case where you will run into 
large amounts of overdispersion, because between-individual differences 
in susceptibility and exposure within sites lead to larger-than-binomial 
variation between sites. However, you can at least partially account for 
this by including a random effect of site in the model. This leads to the 
"normal-binomial" model discussed in earlier posts (how do you all find 
those earlier posts?). 
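
A minimal sketch of that workflow in R, assuming the lme4 package is 
installed. The data are simulated, and every name here (habitat, infested, 
clean, agg) is hypothetical rather than taken from the real data set. It 
collapses individual records to per-site counts, checks a plain binomial 
GLM for overdispersion, and then adds a random intercept for site:

```r
# Sketch only: simulated data, hypothetical variable names.
library(lme4)

set.seed(1)

# One row per site, with a made-up site-level covariate
n_sites <- 40
site_df <- data.frame(
  site    = factor(seq_len(n_sites)),
  habitat = rnorm(n_sites),          # hypothetical site-level covariate
  n_ind   = rpois(n_sites, 35) + 1   # individuals sampled per site (>= 1)
)
site_re <- rnorm(n_sites, sd = 1)    # between-site variation, logit scale

# Expand to one row per individual and simulate presence/absence
idx <- rep(seq_len(n_sites), site_df$n_ind)
ind <- site_df[idx, c("site", "habitat")]
ind$infested <- rbinom(nrow(ind), 1,
                       plogis(-3 + 0.5 * ind$habitat + site_re[idx]))

# Collapse to the pure binomial case: counts with and without parasites
agg <- data.frame(
  site     = site_df$site,
  habitat  = site_df$habitat,
  infested = as.vector(tapply(ind$infested, ind$site, sum)),
  total    = as.vector(tapply(ind$infested, ind$site, length))
)
agg$clean <- agg$total - agg$infested

# Plain binomial GLM; a Pearson dispersion well above 1 suggests
# larger-than-binomial variation between sites
fit_glm <- glm(cbind(infested, clean) ~ habitat,
               family = binomial, data = agg)
dispersion <- sum(residuals(fit_glm, type = "pearson")^2) /
  df.residual(fit_glm)
print(dispersion)

# Same model with a normally distributed random intercept for site
fit_glmm <- glmer(cbind(infested, clean) ~ habitat + (1 | site),
                  family = binomial, data = agg)
print(summary(fit_glmm))
```

With one row per site, the (1 | site) term acts as an observation-level 
random effect, so its estimated variance absorbs the extra between-site 
variation that the plain GLM cannot.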


Drew Tyre

School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974

phone: +1 402 472 4054 
fax: +1 402 472 2946
email: atyre2 at unl.edu

"Renwick, A. R." <a.renwick at abdn.ac.uk> 
Sent by: r-sig-mixed-models-bounces at r-project.org
02/03/2009 08:33 AM

To: "'r-sig-mixed-models at r-project.org'" <r-sig-mixed-models at r-project.org>

Subject: [R-sig-ME] Unbalanced presence/absence data

I am trying to analyse some data I have on the presence/absence of 
parasite infestation on small mammals using a GLMM. However, I have a 
severely unbalanced data set, with far more 0's than 1's (1333 0's and 
86 1's).

The response variable (presence/absence) is at the individual level, 
whereas all the explanatory variables (apart from sex) are at the site 
level.  This means that a lot of the individuals have exactly the same 
combination of all explanatory variables, and when there are so many 
individuals with 0's it leaves very little power.

When I reduce the model, I find that I can remove a number of interaction 
terms without really affecting the AIC, which leads me to be slightly 

One option would be to analyse the data at the site level, i.e. parasite 
prevalence, rather than the probability of an individual being infested.

Any advice as to how to deal with this unbalanced data set would be very 
much appreciated.

Anna Renwick
Institute of Biological & Environment Sciences
University of Aberdeen
Zoology Building
Tillydrone Avenue
AB24 2TZ

