[R-sig-eco] proportion data with many zeros

v_coudrain at voila.fr v_coudrain at voila.fr
Sun Feb 3 16:10:22 CET 2013


Thank you Liz,
I am not familiar with the Tweedie distribution. I'll have a look at it, but I do indeed have some high values. I know about the problems linked to the arcsine
transformation, so I won't consider it anyway. I'd like to use either the raw pollen grain counts or a logistic quasibinomial model.
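
A minimal sketch of these two options in base R, run on simulated data rather than the real pollen counts. The data frame `pollen`, its columns `pollen_x` and `total`, the per-session sample sizes and all simulated proportions are assumptions made up for illustration; only the 4 sampling sessions and the 300 grains per sample come from the original question.

```r
# Simulated stand-in for the pollen data (all values hypothetical):
# 4 sampling sessions, unequal numbers of samples, 300 grains per sample.
set.seed(1)
n_samples <- c(8, 12, 10, 6)                     # unbalanced design
time <- factor(rep(1:4, times = n_samples))

# Sample-level proportions drawn from a beta distribution so that the
# counts are overdispersed and contain many zeros.
mu  <- c(0.03, 0.05, 0.30, 0.08)[as.integer(time)]
phi <- 5
p_i <- rbeta(length(time), mu * phi, (1 - mu) * phi)

total    <- 300
pollen_x <- rbinom(length(time), size = total, prob = p_i)
pollen   <- data.frame(time, pollen_x, total)

# Share of samples in which plant X was not found at all
mean(pollen$pollen_x == 0)

# Option 1: logistic quasibinomial model.
# In cbind(), the first column holds the "successes" (grains of plant X),
# the second the "failures" (all other grains).
fit_qb <- glm(cbind(pollen_x, total - pollen_x) ~ time,
              family = quasibinomial, data = pollen)

# Option 2: quasipoisson model on the raw counts.
fit_qp <- glm(pollen_x ~ time, family = quasipoisson, data = pollen)

# Estimated dispersion (values far above 1 indicate overdispersion)
summary(fit_qb)$dispersion
summary(fit_qp)$dispersion

# F test for the effect of sampling session in the quasibinomial fit
anova(fit_qb, test = "F")

# Pearson residuals, to inspect their distribution
summary(residuals(fit_qb, type = "pearson"))
```

Fitting either model does not require the same number of samples per session, so the unbalanced design is not an obstacle to running `glm()` as such. Whether the residuals look acceptable on the real data is a separate question that this sketch cannot answer.
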
Best,
Valérie


> Message of 02/02/13 at 20:47
> From: "Liz Pryde"
> To: "v_coudrain at voila.fr"
> Cc: "Cade Brian", "r-sig-ecology at r-project.org"
> Subject: Re: [R-sig-eco] proportion data with many zeros
> 
> Have you plotted the raw data to have a look at the distribution?
> You could try another exponential family distribution such as the Tweedie, which has a point mass at zero but is otherwise similar to Poisson/gamma, so
> you are modeling the zeros directly. It won't work if you have a lot of high values, though.
> Proportions are tricky. Have a read of the Warton and Hui paper (2011), "The arcsine is asinine".
> 
> Liz
> 
> 
> 
> On 02/02/2013, at 6:34 PM, v_coudrain at voila.fr wrote:
> 
> > Thank you very much for this suggestion. In fact I reconsidered my question and I am not sure that a zero-inflated model is what I need. If I understood it
> > properly, a zero-inflated model is best suited when we don't know whether zero values are true or false absences (right?). In my case all zero values are
> > assumed to be real absences and are therefore informative. However, fitting a quasipoisson model on the raw counts or a quasibinomial model on the
> > proportions gives me awful residual distributions and meaningless results.
> > 
> > Valérie
> > 
> > 
> >> Message of 01/02/13 at 17:22
> >> From: "Cade, Brian"
> >> To: v_coudrain at voila.fr
> >> Cc: r-sig-ecology at r-project.org
> >> Subject: Re: [R-sig-eco] proportion data with many zeros
> >> 
> >> For a fully parametric approach, you might want to use a zero-inflated
> >> beta distribution (e.g., as available in the gamlss package), which is
> >> designed for zero-inflated proportions. Or for a semi-parametric approach,
> >> you could estimate a sequence of quantile regressions (e.g., in package
> >> quantreg), where some interval (hopefully not too large) of the quantiles
> >> will be uninformative because they are massed at the zero values.
> >> 
> >> Brian
> >> 
> >> Brian S. Cade, PhD
> >> 
> >> U. S. Geological Survey
> >> Fort Collins Science Center
> >> 2150 Centre Ave., Bldg. C
> >> Fort Collins, CO 80526-8818
> >> 
> >> email: brian_cade at usgs.gov
> >> tel: 970 226-9326
> >> 
> >> 
> >> 
> >> On Fri, Feb 1, 2013 at 1:30 AM, wrote:
> >> 
> >>> Dear all, I am trying to test how the proportion of pollen of different
> >>> plants found in the brood cells of a wild bee changes over time. I
> >>> conducted 4 sampling sessions (thus time is a factor with 4 levels) and
> >>> collected several pollen samples for each time point (300 pollen grains
> >>> counted for each sample). I thought about applying a quasi-binomial glm:
> >>> 
> >>> # successes (grains of plant X) first, failures (all other grains) second
> >>> y <- cbind(pollen_x, total_pollen - pollen_x)
> >>> 
> >>> glm(y ~ time, family = quasibinomial)
> >>> 
> >>> The problem is that I have a lot of zero values, because the pollen of
> >>> some plants occurred only rarely or was very clumped in time. I thought
> >>> about applying a zero-inflated model, but I have never used one and I am
> >>> not sure if it is suitable for proportion data. Additionally, I wondered
> >>> whether I have to consider the fact that I don't have the same number of
> >>> pollen samples for each date, which makes my design unbalanced.
> >>> Thank you in advance for any advice.
> >>> 
> >>> Best wishes
> >>> Valérie
> >>> 
> >>> _______________________________________________
> >>> R-sig-ecology mailing list
> >>> R-sig-ecology at r-project.org
> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> > 
> 
