[R] Predicting hurdle model results on spatial scale

Lauria, Valentina valentina.lauria at nuigalway.ie
Mon Oct 21 18:28:47 CEST 2013


Dear List,

I apologise in advance for all my questions. 

I would like to predict the habitat selection of fish species using a hurdle model. I know that I can do this in R with the function predict.hurdle() on newdata, but how this works is not entirely clear to me.

Usually, with a two-step approach, a binary model and a Poisson model are fitted to deal with zero-inflated and over-dispersed data, and the prediction of the binary model is then multiplied by that of the Poisson model in order to weight the predictions. Is this already included in the predict.hurdle() function?
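A minimal sketch of how this could be checked, assuming the pscl package and its bundled bioChemists data as a stand-in for the fish data (the model and variable names are illustrative only). It compares the default type = "response" prediction with the product of the two components returned by type = "count" and type = "zero".

```r
library(pscl)

# Stand-in data shipped with pscl (an assumption; not the fish data).
data("bioChemists", package = "pscl")

fit <- hurdle(art ~ fem + mar + kid5 + phd + ment,
              data = bioChemists, dist = "poisson")
nd <- head(bioChemists)

# Overall expected count: the default prediction type.
mu_resp <- predict(fit, newdata = nd, type = "response")

# Mean of the untruncated count component.
mu_count <- predict(fit, newdata = nd, type = "count")

# Ratio of the probabilities of a non-zero count (hurdle part / count part).
ratio_zero <- predict(fit, newdata = nd, type = "zero")

# Two-step view: P(y > 0) from the binary part times the
# zero-truncated Poisson mean.
p_pos    <- ratio_zero * (1 - exp(-mu_count))
mu_trunc <- mu_count / (1 - exp(-mu_count))

# Both comparisons should agree up to numerical tolerance.
print(all.equal(unname(mu_resp), unname(mu_count * ratio_zero)))
print(all.equal(unname(mu_resp), unname(p_pos * mu_trunc)))
```

If both comparisons agree, the "response" prediction already carries the weighting by the binary part, so no further multiplication would be needed.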

I am also using the function dredge() (from the MuMIn package) to select my best model based on AIC. In this case, however, the best model selected seems to be a combination of the truncated Poisson and the binary model (i.e., the full hurdle model). Is there any way that I could dredge the two model components separately? I did some research and found in the NEWS section that a package pscf was created for this, but when I dug further I did not have much luck.
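One possible way to look at the two components separately without dredge(), sketched here under assumptions: hurdle() in pscl accepts a two-part formula of the form count regressors | zero regressors, so one part can be varied while the other is held fixed and the fits compared with AIC(). The bioChemists data and the chosen regressors are placeholders, not the fish data.

```r
library(pscl)

# Stand-in data shipped with pscl (an assumption; not the fish data).
data("bioChemists", package = "pscl")

# Regressors left of "|" enter the count part, those right of it the zero part.
m_full <- hurdle(art ~ fem + mar + kid5 + phd + ment |
                       fem + mar + kid5 + phd + ment,
                 data = bioChemists)

# Reduce only the count part, keeping the zero part fixed.
m_count_small <- hurdle(art ~ fem + kid5 + ment |
                              fem + mar + kid5 + phd + ment,
                        data = bioChemists)

# Reduce only the zero part, keeping the count part fixed.
m_zero_small <- hurdle(art ~ fem + mar + kid5 + phd + ment |
                             fem + kid5 + ment,
                       data = bioChemists)

aic_tab <- AIC(m_full, m_count_small, m_zero_small)
print(aic_tab)

# With the default binomial zero hurdle, the zero part should correspond to
# an ordinary logistic regression on the indicator y > 0.
zero_glm <- glm(I(art > 0) ~ fem + kid5 + ment,
                data = bioChemists, family = binomial)
print(cbind(glm = coef(zero_glm),
            hurdle = coef(m_zero_small, model = "zero")))
```

If the coefficients in the last table agree, the binary part could presumably be selected on its own as a plain glm, which dredge() can handle (with na.action set to na.fail), while the count part is compared through the two-part formula as above.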

I would be grateful if someone could help me. 
Thank you very much once again,
Valentina




-----Original Message-----
From: Achim Zeileis [mailto:Achim.Zeileis at uibk.ac.at] 
Sent: 18 October 2013 18:57
To: Lauria, Valentina
Cc: r-help at r-project.org
Subject: Re: [R] hurdle model error why does need integer values for the dependent variable?

On Fri, 18 Oct 2013, Lauria, Valentina wrote:

> Dear list,
>
> I am using the hurdle model for modelling the habitat of rare fish 
> species. However I do get an error message when I try to model my data:
>
>> test_new1<-hurdle(GALUMEL~ depth + sal + slope + vrm + lat:long + 
>> offset(log(haul_numb)), dist = "negbin", data = datafit_elasmo)
>
> Error in hurdle(GALUMEL ~ depth + sal + slope + vrm + lat:long + offset(log(haul_numb)),  :
>  invalid dependent variable, non-integer values
>
> When I fit the same model with round(my dependent variable), the 
> model works. Sorry for the stupid question, but could anyone explain to me 
> why? My data are zero-inflated (78% zeros) and positively skewed.

hurdle() fits a count data distribution (poisson, negbin, geometric) by maximum likelihood. Hence, its response needs to be a count variable (i.e., integer). See vignette("countreg", package = "pscl") for the underlying likelihoods employed.
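A small sketch of the point above, using simulated data (all variable names here are hypothetical, and the assumption that the non-integer values come from dividing counts by effort is a guess, not something stated in the question). It checks whether a response consists of counts, shows that a rate is rejected, and fits the raw counts with the effort in an offset instead.

```r
library(pscl)

set.seed(1)
# Simulated stand-in data: raw counts per station and number of hauls.
d <- data.frame(depth = runif(200, 10, 200),
                hauls = sample(1:5, 200, replace = TRUE))
d$count <- rpois(200, lambda = d$hauls * exp(-1 + 0.005 * d$depth))
d$cpue  <- d$count / d$hauls   # a rate, generally not integer-valued

# Helper (hypothetical): TRUE if all values are non-negative whole numbers.
is_count <- function(y) all(y >= 0 & y == round(y))
print(is_count(d$count))
print(is_count(d$cpue))

# A non-integer response is rejected; capture the message instead of stopping.
res <- tryCatch(hurdle(cpue ~ depth, data = d),
                error = function(e) conditionMessage(e))
print(res)

# Raw counts as the response, with the effort entering as an offset.
fit <- hurdle(count ~ depth + offset(log(hauls)), data = d, dist = "poisson")
print(coef(fit))
```

Rounding a rate makes the error disappear but changes the data; keeping the raw counts and putting the effort into the offset is usually the cleaner route when the counts are available.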

> Thank you very much in advance.
> Kind Regards,
> Valentina


