[R] hopelessly overdispersed?

Sat Aug 27 09:18:03 CEST 2011

Dear list!
I am running an analysis on proportion data using a binomial (quasibinomial 
family) error structure. My data comprise two continuous variables, body 
size and range size, as well as feeding guild, nest placement, nest 
type and foraging strata as factors. With these variables I hope to model 
the preference for primary forests (# successes) by certain bird species. 
My code therefore looks like:

# response: successes (primary forest) and failures per species
y <- cbind(n_forest, n_trials - n_forest)
model <- glm(y ~ range + body + nstrata + ntype + forage + feed,
             family = quasibinomial(link = "logit"), data = dat)

However plausible the approach may look, overdispersion is prevalent 
(dispersion estimated at 6.5). I read up on this and learned that with 
multiple factors, not all levels may yield good results with 
logistic regression (Crawley, "The R Book"). I subsequently tried to 
analyse each feeding guild separately, but to no avail: overdispersion 
remains. Given the number of categorical variables in my study, is there 
a convenient way to handle the overdispersion? I also tried tree models 
to identify the most influential variables but, again, to no avail.
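
For reference, this is how I understand the dispersion estimate to be obtained: the sum of squared Pearson residuals divided by the residual degrees of freedom. The sketch below is only an illustration on simulated data; the data frame, the three guild labels and the amount of extra noise (sd = 1 on the logit scale) are made up and are not my bird data.

```r
# Simulated stand-in data (hypothetical, NOT the real bird data)
set.seed(1)
n <- 200
dat <- data.frame(
  range    = rnorm(n),
  body     = rnorm(n),
  feed     = factor(sample(c("insect", "fruit", "seed"), n, replace = TRUE)),
  n_trials = rpois(n, 20) + 1
)

# Extra-binomial variation: every row gets its own noisy success probability
p <- plogis(0.3 * dat$range + rnorm(n, sd = 1))
dat$n_forest <- rbinom(n, dat$n_trials, p)

fit <- glm(cbind(n_forest, n_trials - n_forest) ~ range + body + feed,
           family = quasibinomial(link = "logit"), data = dat)

# Pearson-based dispersion: sum of squared Pearson residuals / residual df
phi <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
phi

# Should agree with the value that summary() reports for a quasi family
all.equal(phi, summary(fit)$dispersion)
```

A value near 1 would be consistent with plain binomial variation; values well above 1, like the 6.5 I get on the real data, indicate overdispersion.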

BTW: It may well be that the data is just bad...

thanks a lot!