[R-sig-eco] proportion data with many zeros

v_coudrain at voila.fr v_coudrain at voila.fr
Sat Feb 2 08:34:28 CET 2013


Thank you very much for this suggestion. I have reconsidered my question,
and I am no longer sure that a zero-inflated model is what I need. If I
understood correctly, a zero-inflated model is best suited to cases where we
do not know whether the zeros are true or false absences (right?). In my case
all zeros are assumed to be real absences and are therefore informative.
However, fitting a quasi-Poisson GLM to the raw counts or a quasi-binomial
GLM to the proportions gives me awful residual distributions and meaningless
results.
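
A minimal sketch of the two fits described above, run on simulated data. The
data frame `d`, its column names, the number of samples per session and the
simulated proportions are all assumptions made for illustration; they are not
the real pollen counts.

```r
# Simulated stand-in for the pollen data (all values hypothetical):
# 4 sampling sessions, unequal numbers of samples, 300 grains per sample.
set.seed(1)
n_per_time <- c(8, 5, 10, 6)
d <- data.frame(time = factor(rep(1:4, times = n_per_time)))
d$total <- 300

# Assumed probability that plant X occurs in a sample at all, and its
# assumed share of the pollen when it does occur, for each session.
p_present <- c(0.1, 0.3, 0.6, 0.5)[as.integer(d$time)]
p_share   <- c(0.02, 0.05, 0.20, 0.30)[as.integer(d$time)]
d$x <- rbinom(nrow(d), d$total, p_share) * rbinom(nrow(d), 1, p_present)

# Quasi-binomial GLM on the proportion of plant X
# (first column = grains of plant X, second column = all other grains).
fit_qb <- glm(cbind(x, total - x) ~ time, family = quasibinomial, data = d)

# Quasi-Poisson GLM on the raw counts of plant X.
fit_qp <- glm(x ~ time, family = quasipoisson, data = d)

# Share of zero samples and the estimated dispersion of each fit.
mean(d$x == 0)
summary(fit_qb)$dispersion
summary(fit_qp)$dispersion

# Pearson residuals against fitted values for the quasi-binomial fit.
plot(fitted(fit_qb), residuals(fit_qb, type = "pearson"),
     xlab = "Fitted proportion", ylab = "Pearson residual")
```

Because `time` enters as a factor, the unequal number of samples per session
needs no special handling in either `glm()` call. If a session contains only
zeros, R may warn about fitted probabilities numerically equal to 0, and the
estimate for that session is then not reliable.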

Valérie


> Message of 01/02/13 at 17:22
> From: "Cade, Brian"
> To: v_coudrain at voila.fr
> Cc: r-sig-ecology at r-project.org
> Subject: Re: [R-sig-eco] proportion data with many zeros
> 
> For a fully parametric approach, you might want to use a zero-inflated
> beta distribution (e.g., as available in the gamlss package), which is
> designed for zero-inflated proportions. Or, for a semi-parametric approach,
> you could estimate a sequence of quantile regressions (e.g., with the
> quantreg package), where some interval (hopefully not too large) of the
> quantiles will be uninformative because they are massed at the zero values.
> 
> Brian
> 
> Brian S. Cade, PhD
> 
> U. S. Geological Survey
> Fort Collins Science Center
> 2150 Centre Ave., Bldg. C
> Fort Collins, CO 80526-8818
> 
> email: brian_cade at usgs.gov
> tel: 970 226-9326
> 
> 
> 
> On Fri, Feb 1, 2013 at 1:30 AM,  wrote:
> 
> > Dear all, I am trying to test how the proportion of pollen of different
> > plants found in the brood cells of a wild bee changes over time. I
> > conducted 4 sampling sessions (thus time is a factor with 4 levels) and
> > collected several pollen samples for each time point (300 pollen grains
> > counted for each sample). I thought about applying a quasi-binomial glm:
> >
> > y = cbind(total pollen - pollen of plant X, pollen of plant X)
> >
> > glm(y~time, family=quasibinomial)
> >
> > The problem is that I have a lot of zero values, because the pollen of
> > some plants occurred only rarely or was very clumped in time. I thought
> > about applying a zero-inflated model, but I have never used one and I am
> > not sure whether it is suitable for proportion data. Additionally, I
> > wondered whether I have to account for the fact that I don't have the
> > same number of pollen samples for each date, which makes my design
> > unbalanced. Thank you in advance for any advice.
> >
> > Best wishes
> > Valérie
> > ___________________________________________________________
> > CAN 2013 : résultats et matchs en direct à suivre sur Voila.fr
> > http://sports.voila.fr/football/can/
> >
> > _______________________________________________
> > R-sig-ecology mailing list
> > R-sig-ecology at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> >
> 



