[R] Logistic regression model selection with overdispersed/autocorrelated data
Jesse.Whittington@pc.gc.ca
Tue Jan 31 17:09:00 CET 2006
Jesse.Whittington at pc.gc.ca wrote:
>
> I am creating habitat selection models for caribou and other species with
> data collected from GPS collars. In my current situation the radio-collars
> recorded the locations of 30 caribou every 6 hours. I am then comparing
> resources used at caribou locations to random locations using logistic
> regression (standard habitat analysis).
>
> The data are therefore highly autocorrelated, and this inflates Type I error
> in two ways: standard errors around the beta-coefficients are too small, and
> models become over-parameterized during model selection. Robust standard
> errors are easily calculated by block-bootstrapping the data using "animal"
> as a cluster with the Design library; however, I haven't found a
> satisfactory solution for model selection.
>
> A couple options are:
> 1. Using QAIC where the deviance is divided by a variance inflation factor
> (Burnham & Anderson). However, this VIF can vary greatly depending on the
> data set and the set of covariates used in the global model.
> 2. Manual forward stepwise regression using both changes in deviance and
> robust p-values for the beta-coefficients.
>
> I have been looking for a solution to this problem for a couple years and
> would appreciate any advice.
>
> Jesse
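For readers following along, option 1 in the quoted message can be sketched in base R roughly as below. This is only an illustration under assumptions: the data are simulated, the helper name qaic() is made up for this example, and c_hat is simply supplied as a hypothetical value rather than estimated from a global model. The extra parameter counted for c_hat follows the Burnham & Anderson convention as I understand it.

```r
# Simulated stand-in data: 30 animals, 40 fixes each (all values are made up)
set.seed(1)
n_animals <- 30
n_per <- 40
animal <- rep(seq_len(n_animals), each = n_per)
u <- rnorm(n_animals)[animal]   # shared animal effect, so responses within an animal are correlated
x1 <- rnorm(n_animals * n_per)
x2 <- rnorm(n_animals * n_per)
y <- rbinom(n_animals * n_per, 1, plogis(-0.5 + 0.8 * x1 + u))
d <- data.frame(y, x1, x2)

# Hypothetical helper: QAIC = -2 * logLik / c_hat + 2 * K,
# where K counts one extra parameter for the estimated c_hat
qaic <- function(fit, c_hat) {
  ll <- logLik(fit)
  k <- attr(ll, "df") + 1
  -2 * as.numeric(ll) / c_hat + 2 * k
}

# A small candidate set, ordinary logistic regressions
candidates <- list(
  null = glm(y ~ 1, family = binomial, data = d),
  x1   = glm(y ~ x1, family = binomial, data = d),
  full = glm(y ~ x1 + x2, family = binomial, data = d)
)

c_hat <- 2.5   # hypothetical variance inflation factor, not estimated here
qaic_tab <- sapply(candidates, qaic, c_hat = c_hat)
qaic_tab - min(qaic_tab)   # delta-QAIC for each candidate
```

Rerunning the last three lines with different values of c_hat shows how the penalty for extra covariates grows relative to the fit as the VIF grows, which is the sensitivity raised in the quoted message.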
Frank E Harrell Jr wrote:
If you must do non-subject-matter-driven model selection, look at the
fastbw function in Design, which will use the cluster bootstrap variance
matrix.
Frank
Thanks for the tip. I didn't know that the fastbw function could account
for the clustered variance. For others, the code to run such a model from
the Design library would be:
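library(Design)
# Fit the logistic model; x=TRUE, y=TRUE keep the design matrix and response for bootcov
model.1 <- lrm(y ~ x1 + x2 + x3 + x4, data = data, x = TRUE, y = TRUE)
# Cluster bootstrap by animal to calculate the robust variance matrix
model.2 <- bootcov(model.1, cluster = data$animal, B = 10000)
# Backward step-wise selection using the bootstrap variance matrix
fastbw(model.2)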
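
For anyone without Design installed, the idea behind the cluster bootstrap can be sketched in base R along the following lines. This is not how bootcov is implemented, just a minimal illustration of resampling whole animals. The data are simulated, and the object names (rows_by_animal, boot_coef and so on) are made up for the example.

```r
# Simulated stand-in data: 30 animals, 40 fixes each (all values are made up)
set.seed(42)
n_animals <- 30
n_per <- 40
animal <- rep(seq_len(n_animals), each = n_per)
u <- rnorm(n_animals)[animal]   # shared animal effect, so responses within an animal are correlated
x1 <- rnorm(n_animals * n_per)
y <- rbinom(n_animals * n_per, 1, plogis(-0.5 + 0.8 * x1 + u))
d <- data.frame(y, x1, animal)

# Ordinary fit; its variance matrix treats all rows as independent
fit <- glm(y ~ x1, family = binomial, data = d)
naive_se <- sqrt(diag(vcov(fit)))

# Row indices grouped by animal, so that whole animals are resampled together
rows_by_animal <- split(seq_len(nrow(d)), d$animal)

B <- 200   # kept small for the sketch; the post above uses B = 10000
boot_coef <- t(replicate(B, {
  picked <- sample(length(rows_by_animal), replace = TRUE)   # draw animals with replacement
  idx <- unlist(rows_by_animal[picked], use.names = FALSE)   # all rows of the drawn animals
  coef(glm(y ~ x1, family = binomial, data = d[idx, ]))
}))

# Bootstrap variance matrix and standard errors of the coefficients
robust_vcov <- cov(boot_coef)
robust_se <- sqrt(diag(robust_vcov))
rbind(naive = naive_se, cluster_boot = robust_se)
```

The point of sampling animals rather than rows is that each bootstrap replicate keeps the within-animal dependence intact, so the spread of the refitted coefficients reflects the number of animals rather than the number of fixes.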
Later we will examine individual caribou responses to trails
(subject-specific model selection). For this we plan to use mixed effects
models (lmer). Is this what you would also recommend?
I look forward to reading the new edition of your book when it is
published.
Jesse