[R-sig-eco] modelling zero inflated continuous data using a two stage model in glmmpql - question about implementing the delta method (Philip Harrison)

Leone Brown leone.m.brown at gmail.com
Wed Feb 4 15:12:11 CET 2015


Hi Phil,
Not having looked in great detail at your models, but just the general
question - have you checked out the delta method functions in Ben
Bolker's emdbook package? I have found it very useful
(http://cran.r-project.org/web/packages/emdbook/emdbook.pdf).
Cheers,
Lee

> Date: Tue, 03 Feb 2015 16:21:18 -0500
> From: Philip Harrison <pharriso at uwaterloo.ca>
> To: r-sig-ecology at r-project.org
> Subject: Re: [R-sig-eco] modelling zero inflated continuous data using
>         a two stage model in glmmpql - question about implementing the delta
>         method
> Message-ID:
>         <20150203162118.20141bzmo53jwjwu at www.nexusmail.uwaterloo.ca>
> Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes";
>         format="flowed"
>
> Update,
>
> So it seems predictions for the delta method ARE as simple as the
> probabily*estimates, which for the example can pretty easily computed
> using the predict function (ive added my coding below)
>
> However does anyone know how to compute confidence intervals with this
> delta method? Im guessing its not as straightforward as the product of
> the confidence intervals for the two models which are easily computed?
>
> Any help would be greatly appreciated?
>
> Cheers
>
> Phil
>
> Code
>
> #### generate a grid for prediction which is object "predition grid"
>
>
> library(AICmodavg)
>
> #generate probability predictions form the binomial model
>
> predoBin<-predictSE.lme(BinaryMod,predictiongrid,level=0,print.matrix=TRUE,se.fit=TRUE)
>
> predsBinG<-cbind(predictiongrid,predoBin)
>
> predsBinG$Probality<-plogis(predsBinG$fit)
> predsBinG$CIH.bin<-plogis(predsBinG$fit+(1.96*predsBinG$fit$se.fit))
> predsBinG$CIL.bin<-plogis(predsBinG$fit-(1.96*predsBinG$fit$se.fit))
>
>
> ###give the binomial model predictions an identifier
>
>
> ####generate (and backtransform from the cuberoot) predictions from
> the  positive model
>
> predoPos<-predictSE.lme(PositveMod,predictiongrid,level=0,print.matrix=TRUE,se.fit=TRUE)
> predsPosG<-cbind(PositiveModel,predoPos)
> predsPosG$PosWij<-(predsPosG$fit)^3
> predsBinG$CIH<-(predsPosG$PosWij+(1.96*predsPosG$se.fit))^3
> predsBinG$CIL<-(predsPosG$PosWij-(1.96*predsPosG$se.fit))^3
>
>
> DeltaMaker<-cbind(predBinG,predsPosG)
>
> #########get the conditional estimates
>
> DeltaMaker$ConditionalWij<-DeltaMaker$Probability*DeltaMaker$PosWij
>
>
> ####theorectical CI estimation (this is likely wrong???need to
> bootstrap or something?)
>
>
> DeltaMaker$ConditionalWijCIH<-DeltaMaker$CIH.bin*DeltaMaker$CIH
>
> DeltaMaker$ConditionalWijCIL<-DeltaMaker$CIL.bin*DeltaMaker$CIL
>
> ####im thinking maybe only using the CIs from the positive model might
> make sense?
>
> DeltaMaker$ConditionalWijCIH<-DeltaMaker$Probability*DeltaMaker$CIH
>
> DeltaMaker$ConditionalWijCIL<-DeltaMaker$Probability*DeltaMaker$CIL
>
>
> Quoting Philip Harrison <pharriso at uwaterloo.ca>:
>
>> Dear R-sig-ecology group,
>>
>> I have some telemetry data with a zero-inflated (semi-continuous)
>> response, which I modelled using a two stage process using
>> glmmpql/nmle. I chose nlme/glmmpql because a temporal correlation
>> structure is very important for my telemetry data and I can live
>> with pql estimation and no AICs if it means I can avoid winBUGS!.
>>
>> I am pretty happy with my models and I don?t want to use
>> ZIPglmms,ZINBglmms or ZAPglmms as the response is a pretty normal
>> continuous variable and not a count.
>>
>> So now, if possible I would like to combine the results of the two
>> models to make predictions about resource selection (with Confidence
>> intervals) for each of the levels of my fixed effects, using the
>> ?delta? approach, mentioned a few times on r-sig-eco including here
>>
>> https://stat.ethz.ch/pipermail/r-sig-ecology/2011-May/002159.html
>>
>> I found quite a few papers about the delta approach (typically using
>> abundance estimates as the response), but I am having trouble
>> translating the method into code (I?ve added refs at the end of the
>> message). I have seen papers where they appear to just multiply the
>> predictions from the positive model by the probability from the
>> first model (although the papers in question use Bayesian techniques
>> multiplying the posterior modes) Can I do the same with the
>> predictions from my frequentist approach, somehow that seems over
>> simplistic?
>>
>> Also the response for the positive part of my model normalizes
>> nicely with a cube root transformation and many of the papers use
>> log-normal positive responses- I?m assuming this doesn?t make any
>> difference as long as I leave the back transformation until the very
>> end?
>>
>> Any help would be much appreciated. See below for model details
>>
>> ###############################################################################
>> Models- First I model the probability of a non zero value, (i.e.
>> presence/absence) then model the continuous part i.e. the strength
>> of selection in the event of a positive selection event.  Sorry I
>> don?t have a real example dataset; however in this case I just need
>> to find out how to extract the relevant stuff from lme/glmmpql output.
>>
>> Wij ? this is my resource selection response variable  (Wij=
>> proportion of time spent in habitat/proportion of habitat
>> available). This response varies from 0 (i.e. habitat was available
>> but not used) continuously through 1 (neither selected nor avoided)
>> and continuously >1 (positive selection) up to a theoretical infinity.
>>
>> Seas; a categorical 4 level variable
>> Sequence; a variable created based on detection date used for the
>> correlation structure.
>> Habitat: a categorical habitat variable
>>
>> Xdata$WijBin<-ifelse(Xdata$Wij>0,1,0)
>> XdataUsed<- Xdata[Xdata$WijBIN0==1,]
>>
>> BinaryMod<-glmmpql(BinaryVariable~Seas+Habitat+Seas*Habitat,random=~1|FishID,cor=corCAR1(form=~Sequence),data=Xdata,family=?binomial?)
>>
>> PositiveModel <-lme((Wij)^(1/3)~
>> Seas+Habitat+Seas*Habitat,random=~1|FishID/TemperatureCategory,cor=corCAR1(form=~Sequence),data=XdataUsed)
>>



More information about the R-sig-ecology mailing list