[R] Logistic regression model selection with overdispersed/autocorrelated data

Frank E Harrell Jr f.harrell at vanderbilt.edu
Mon Jan 30 22:37:23 CET 2006


Jesse.Whittington at pc.gc.ca wrote:
> 
> I am creating habitat selection models for caribou and other species with
> data collected from GPS collars.  In my current situation the radio-collars
> recorded the locations of 30 caribou every 6 hours.  I am then comparing
> resources used at caribou locations to random locations using logistic
> regression (standard habitat analysis).
> 
> The data is therefore highly autocorrelated and this causes Type I error
> two ways – small standard errors around beta-coefficients and
> over-parameterization during model selection.  Robust standard errors are
> easily calculated by block-bootstrapping the data using "animal" as a
> cluster with the Design library, however I haven't found a satisfactory
> solution for model selection.
> 
> A couple options are:
> 1.  Using QAIC where the deviance is divided by a variance inflation factor
> (Burnham & Anderson).  However, this VIF can vary greatly depending on the
> data set and the set of covariates used in the global model.
> 2.  Manual forward stepwise regression using both changes in deviance and
> robust p-values for the beta-coefficients.
> 
> I have been looking for a solution to this problem for a couple years and
> would appreciate any advice.
> 
> Jesse

If you must do non-subject-matter-driven model selection, look at the 
fastbw function in Design, which will use the cluster bootstrap variance 
matrix.
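
A minimal sketch of that workflow, on simulated data, might look like the 
following.  It assumes the rms package (the successor to Design, with the 
same lrm, bootcov and fastbw functions); the data frame and the variable 
names (animal, elev, slope, noise, used) are hypothetical and stand in for 
the real used/random location data.

```r
# Sketch only: simulated data; all variable names are hypothetical.
library(rms)  # successor to Design; provides lrm, bootcov, fastbw

set.seed(1)
n_animals <- 30
n_per     <- 40
d <- data.frame(
  animal = rep(seq_len(n_animals), each = n_per),
  elev   = rnorm(n_animals * n_per),
  slope  = rnorm(n_animals * n_per),
  noise  = rnorm(n_animals * n_per)
)
# An animal-level random effect induces within-animal correlation
u <- rnorm(n_animals, sd = 0.5)[d$animal]
d$used <- rbinom(nrow(d), 1, plogis(-0.2 + 0.8 * d$elev + u))

# x = TRUE, y = TRUE keep the design matrix and response, which bootcov needs
fit <- lrm(used ~ elev + slope + noise, data = d, x = TRUE, y = TRUE)

# Cluster bootstrap: resample whole animals, then replace the model's
# variance-covariance matrix with the bootstrap estimate
fit_boot <- bootcov(fit, cluster = d$animal, B = 200)

# Fast backward elimination; its Wald statistics use the variance matrix
# stored in the fit, i.e. the cluster bootstrap one
sel <- fastbw(fit_boot, rule = "aic")
print(sel)
```

Here rule = "aic" is just one choice; rule = "p" with an sls threshold is the 
alternative stopping rule.  Backward elimination on Wald statistics is still 
data-driven selection, so the caveat above about subject-matter-driven 
models applies.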

Frank


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



