[R-sig-ME] choice of prediction error calculations for zero inflated model

Sat Mar 30 13:49:26 CET 2019

Hello All,

thought I would reach out to see if you have some guidance on the following: I am working with a zero inflated data set and fitting models that should be reasonable to model such data (zeroinfl() and hurdle() from pscl, mixtures from flexmix, glm.nb, etc) and trying to compare model predictive performance based on a validation data set (70% of all data was used to train and the renaming 30% for validation)... This is not necessarily a coding question but rather a stat oriented perhaps although a working example would be helpful, if there is one: I have looked extensively to see what is in the literature for measures of predictive performance of zero inflated models based on a validation data set to compare observed vs predicted responses for count data, but could not come up with much. I am familiar with general measures of performance, like RMSE, MAE, etc but finding not much on it's appropriateness for use in my setting. Some references point towards adjusted/pseudo r squared approaches but most references are to evaluate model fit during model development vs predictive performance using a validation set... Any thoughts, directions you may be able to help me with? 

much appreciate your input...

thanks,

Andras