[R-sig-ME] Unbalanced presence/absence data
Renwick, A. R.
a.renwick at abdn.ac.uk
Tue Feb 3 15:22:22 CET 2009
I am trying to analyse some data on the presence/absence of parasite infestation on small mammals using a GLMM. However, the data set is severely unbalanced: there are far more 0s than 1s (1333 0s and 86 1s, i.e. about 6% of the 1419 individuals are infested).
The response variable (presence/absence) is at the individual level, whereas all the explanatory variables (apart from sex) are at the site level. This means that many individuals share exactly the same combination of explanatory variables, and with so many of them being 0s there is very little power.
When I reduce the model I find that I can remove a number of interaction terms without really affecting the AIC, which makes me slightly concerned.
One option would be to analyse the data at the site level, i.e. parasite prevalence per site, rather than the probability of an individual being infested.
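To make the site-level option concrete, here is a minimal sketch of collapsing individual 0/1 records into per-site counts and prevalence. It is written in Python purely for illustration; the site labels, the records, and the function name site_prevalence are all hypothetical and are not my real data. Note that sex is an individual-level variable, so it would be lost in this aggregation unless the grouping were done by site and sex.

```python
from collections import defaultdict

# Hypothetical individual-level records: (site, infested), infested is 0 or 1.
records = [
    ("A", 0), ("A", 0), ("A", 1),
    ("B", 0), ("B", 0), ("B", 0), ("B", 0),
    ("C", 1), ("C", 0),
]

def site_prevalence(records):
    """Collapse 0/1 records into per-site (infested, total, prevalence)."""
    counts = defaultdict(lambda: [0, 0])  # site -> [infested, total]
    for site, infested in records:
        counts[site][0] += infested
        counts[site][1] += 1
    return {site: (k, n, k / n) for site, (k, n) in counts.items()}

summary = site_prevalence(records)
for site, (k, n, p) in sorted(summary.items()):
    print(f"{site}: {k}/{n} infested, prevalence {p:.2f}")
```

The per-site counts (infested out of total caught) would then be the response for a binomial model on counts, instead of one Bernoulli observation per individual.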
Any advice as to how to deal with this unbalanced data set would be very much appreciated.
Institute of Biological & Environmental Sciences
University of Aberdeen