[R-sig-ME] Modelling count data in glmer with an apriori model selection

Tue Apr 18 02:51:33 CEST 2017

Hi All,

I am modelling bear distribution in Lao PDR with glmer, using sign count
data collected on transects and an a priori, degrees-of-freedom-spending
modelling approach. I have calculated the number of degrees of freedom my
model can afford from my effective sample size (i.e. the number of line
transects), counting degrees of freedom as the number of non-intercept
coefficients the model has to estimate. I have study site as a random
effect (n=7).
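
For concreteness, here is a minimal sketch of the two model structures I am
choosing between. It runs on simulated data only: the data frame `bears`,
the columns `sign_count`, `elevation` and `forest_cover`, the number of
transects and all simulation parameters are made up for illustration and
are not my real data or covariates.

```r
# Sketch only: every name and number below is hypothetical.
library(lme4)

set.seed(1)
n_sites <- 7            # study sites (the random effect)
n_per_site <- 20        # made-up number of transects per site
n <- n_sites * n_per_site

bears <- data.frame(
  site = factor(rep(seq_len(n_sites), each = n_per_site)),
  elevation = rnorm(n),       # placeholder covariate
  forest_cover = rnorm(n)     # placeholder covariate
)

# Simulate overdispersed counts with a site-level random intercept
site_effect <- rnorm(n_sites, sd = 0.5)
mu <- exp(0.2 + 0.4 * bears$forest_cover +
            site_effect[as.integer(bears$site)])
bears$sign_count <- rnbinom(n, mu = mu, size = 2)

# Poisson GLMM: two non-intercept fixed effects plus one random-intercept
# variance for site
m_pois <- glmer(sign_count ~ elevation + forest_cover + (1 | site),
                data = bears, family = poisson)

# Negative binomial GLMM: same structure, plus an estimated dispersion
# parameter (theta)
m_nb <- glmer.nb(sign_count ~ elevation + forest_cover + (1 | site),
                 data = bears)
```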

My objectives are to model bear occurrence as a function of covariates, to
rank those covariates in order of importance, and to predict the
distribution of bears throughout the whole country (i.e. extrapolate
outside the study sites). This is my first experience with an a priori
modelling strategy, and I have a number of questions for which I have not
found answers in the published literature. I would be grateful for any
advice anyone may have:

- How many degrees of freedom will including a 7-level random effect incur?

- My understanding is that I must pick my probability distribution (i.e.
Poisson or Negative Binomial) a priori, and so I cannot use the usual
post-fit model checks to determine whether my chosen distribution was
appropriate. Is this correct?

- My understanding is that I'll be penalized an extra degree of freedom for
using a Negative Binomial distribution. Is this correct?

- How do I decide between a Poisson and a Negative Binomial distribution?
Are there any post hoc checks I can do, without exploring the relationship
between the response and the predictors, to inform my decision?

(The literature tells me that count data are rarely Poisson distributed,
and that the Negative Binomial is the most common distribution that
accounts for overdispersion. I have ruled out zero-inflation; my response
has plenty of zeros, but I feel they will be accounted for by the model
covariates.)

- In the context of my study objectives, what are the consequences of using
a Poisson distribution when my data are really Negative Binomial? In other
words, does the distribution of the residuals of the response really matter?

Many thanks in advance for any insights you can offer.

Best wishes
Lorraine

-- 
Lorraine Scotson, PhD Candidate,
Department of Fisheries, Wildlife and Conservation Biology,
University of Minnesota, USA

Skype ID: lorrainescotson
Tel: +44141 6282079
