[R] Logistic regression model selection with overdispersed/autocorrelated data

Jesse.Whittington@pc.gc.ca Jesse.Whittington at pc.gc.ca
Mon Jan 30 17:22:45 CET 2006

I am creating habitat selection models for caribou and other species with
data collected from GPS collars.  In my current situation the radio-collars
recorded the locations of 30 caribou every 6 hours.  I am then comparing
resources used at caribou locations to random locations using logistic
regression (standard habitat analysis).

The data are therefore highly autocorrelated, and this inflates the Type I
error rate in two ways: standard errors around the beta-coefficients are too
small, and models become over-parameterized during model selection.  Robust
standard errors are easily calculated by block-bootstrapping the data using
"animal" as a cluster with the Design library; however, I haven't found a
satisfactory solution for model selection.
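
As a rough illustration of the block bootstrap described above, here is a
minimal sketch in base R rather than with the Design functions.  The data are
simulated stand-ins, not the caribou data, and the names dat, elev, used and
cluster_boot_se are made up for the example.  Whole animals are resampled
with replacement, so the within-animal correlation is kept inside each
bootstrap sample.

```r
# Simulated stand-in data (an assumption, not the real collar data):
# 30 animals, 40 points each, used = 1 for animal locations, 0 for random.
set.seed(1)
n_animals <- 30
n_per <- 40
animal <- rep(seq_len(n_animals), each = n_per)

dat <- data.frame(
  animal = animal,
  # Covariate with an animal-level component, so points within an animal
  # resemble each other.
  elev = rnorm(n_animals * n_per) + rnorm(n_animals)[animal]
)
# Animal-level random intercept on the logit scale induces correlation
# among responses from the same animal.
eta <- -0.5 + 0.8 * dat$elev + rnorm(n_animals)[animal]
dat$used <- rbinom(nrow(dat), 1, plogis(eta))

fit <- glm(used ~ elev, family = binomial, data = dat)

# Hypothetical helper: cluster (block) bootstrap standard errors.
cluster_boot_se <- function(formula, data, cluster, B = 200) {
  ids <- unique(data[[cluster]])
  rows_by_id <- split(seq_len(nrow(data)), data[[cluster]])
  coefs <- replicate(B, {
    # Draw whole animals with replacement, then take all of their rows.
    drawn <- sample(ids, length(ids), replace = TRUE)
    idx <- unlist(rows_by_id[as.character(drawn)], use.names = FALSE)
    coef(glm(formula, family = binomial, data = data[idx, , drop = FALSE]))
  })
  # One row per coefficient, one column per bootstrap replicate.
  apply(coefs, 1, sd)
}

naive_se <- sqrt(diag(vcov(fit)))
robust_se <- cluster_boot_se(used ~ elev, dat, "animal")
print(rbind(naive = naive_se, cluster_boot = robust_se))
```

Comparing the two rows shows how much the model-based standard errors
change once animals, rather than individual locations, are treated as the
sampling unit.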

A couple of options are:
1.  Using QAIC where the deviance is divided by a variance inflation factor
(Burnham & Anderson).  However, this VIF can vary greatly depending on the
data set and the set of covariates used in the global model.
2.  Manual forward stepwise regression using both changes in deviance and
robust p-values for the beta-coefficients.
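
Option 1 can be sketched as follows.  This is only an illustration under
stated assumptions: the data are simulated grouped binomial counts (the names
dat, x1, x2, used, avail and qaic are made up), the inflation factor c_hat is
estimated as the Pearson chi-square of the global model divided by its
residual degrees of freedom, and one extra parameter is counted for
estimating c_hat.  With ungrouped 0/1 responses this Pearson-based estimate
is generally not informative, so the counts are grouped here.

```r
# Simulated, overdispersed grouped binomial data (an assumption, not real data).
set.seed(2)
n_groups <- 120
trials <- rep(20, n_groups)
x1 <- rnorm(n_groups)
x2 <- rnorm(n_groups)   # pure noise covariate
# Extra group-level noise on the logit scale creates overdispersion.
p <- plogis(-0.3 + 0.7 * x1 + rnorm(n_groups, sd = 0.8))
used <- rbinom(n_groups, trials, p)
dat <- data.frame(used = used, avail = trials - used, x1 = x1, x2 = x2)

# Global model: the inflation factor is estimated once, from this model,
# and then reused for every candidate model.
global <- glm(cbind(used, avail) ~ x1 + x2, family = binomial, data = dat)
c_hat <- sum(residuals(global, type = "pearson")^2) / df.residual(global)
c_hat <- max(1, c_hat)   # values below 1 are treated as no overdispersion

# Hypothetical helper: QAIC = -2 * logLik / c_hat + 2 * K,
# where K counts the regression coefficients plus one for c_hat.
qaic <- function(fit, c_hat) {
  k <- length(coef(fit)) + 1
  -2 * as.numeric(logLik(fit)) / c_hat + 2 * k
}

candidates <- list(
  null   = glm(cbind(used, avail) ~ 1,  family = binomial, data = dat),
  x1     = glm(cbind(used, avail) ~ x1, family = binomial, data = dat),
  global = global
)
qaic_values <- sapply(candidates, qaic, c_hat = c_hat)
print(c_hat)
print(sort(qaic_values))
```

Because c_hat comes from the global model, changing the set of covariates in
that model changes every QAIC value, which is the sensitivity noted in
option 1.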

I have been looking for a solution to this problem for a couple of years and
would appreciate any advice.

