[R-sig-ME] Incorporating a Temporal Correlation Structure in a GLM
Kevin J. Ryan
Kevin_J_Ryan at umit.maine.edu
Tue Feb 28 17:44:15 CET 2012
Thank you all very much for taking the time to give advice on my statistical issues. So far I have run logistic regression models using glm, lmer, and glmmPQL. I used pacf to look at autocorrelation of the residuals of these models, and the residuals do not
appear to be autocorrelated (assuming pacf is suitable for the residuals of a logistic regression). Something may be going wrong with the lmer model, however (its output is below this message): I included Toad as a random effect, and its
variance and SD are reported as 0. Perhaps because of this, the coefficients of the glm model and the lmer model are exactly the same.
So if my residuals are okay, then perhaps an ordinary glm (pooling all toads) is the way to go. I would have liked to model the proportion of toads emerged, but I only had two monitoring devices, which more often than not were not deployed
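For reference, the residual check described above could be sketched roughly as follows. The data frame `toads` and its columns (`Emergence`, `Tavg`, `Toad`) are placeholder names standing in for the actual data, which are not shown in this thread:

```r
## Sketch (not the poster's actual code) of checking residual
## autocorrelation after fitting the pooled and mixed models.
library(lme4)

fit_glm  <- glm(Emergence ~ Tavg, family = binomial, data = toads)
fit_glmm <- glmer(Emergence ~ Tavg + (1 | Toad),
                  family = binomial, data = toads)

## Partial autocorrelation of the deviance residuals; for binary
## responses this is only a rough diagnostic.
pacf(residuals(fit_glm,  type = "deviance"))
pacf(residuals(fit_glmm, type = "deviance"))
```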
Thanks again everyone,
Generalized linear mixed model fit by the Laplace approximation
Formula: Emergence ~ Tavg + (1 | Toad)
   AIC BIC logLik deviance
 481.2 493 -237.6    475.2
Random effects:
 Groups Name        Variance Std.Dev.
 Toad   (Intercept) 0        0
Number of obs: 371, groups: Toad, 16

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.14598    0.95134  -5.409 6.33e-08 ***
Tavg         0.07202    0.01392   5.174 2.29e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
Anthony R Ives <arives at wisc.edu> writes:
>It is not clear to me that this problem requires accounting for
>temporal autocorrelation, although I might be missing something. You
>say that the weather variables are autocorrelated, but if these are
>used as predictor variables, this doesn't necessarily mean that you
>need a model incorporating autocorrelation; autocorrelation of the
>errors (residuals) is what matters. Also, it is not clear to me why
>you would want to use logistic regression on each individual
>separately. I would suspect that there is correlation among
>individuals beyond that explained by the weather variables you
>included. It might be simpler to analyze all individuals together
>(i.e., proportion of emergences on a given day).
>That being said, there are three additional approaches to add to Ben's
>list for logistic regression with temporal autocorrelation:
>1. You could use an extended Kalman filter with a measurement
>equation accounting for the variance structure of a binary process.
>An advantage here is that it is simple to include gaps in the
>observations. I have seen this done in the literature, but a quick
>check didn't turn up a reference.
>2. There is a largish literature on integer-valued ARMA models,
>though I don't know of code that will do this easily.
>3. With colleagues, I've worked out two flavors of logistic
>regression with phylogenetic correlations. These could be used by
>replacing the phylogenetic covariance matrix with an autocovariance
>matrix. All of these will require a little custom programming.
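Ives's earlier suggestion of analyzing all individuals together could be sketched like this; all names here are placeholders, since the real data are not shown in the thread:

```r
## Hypothetical sketch: model the daily count of emerged toads as
## a binomial proportion rather than per-individual 0/1 outcomes.
## 'daily' is assumed to have one row per day, with columns
## n_emerged, n_total, and Tavg.
fit_prop <- glm(cbind(n_emerged, n_total - n_emerged) ~ Tavg,
                family = binomial, data = daily)
summary(fit_prop)
```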
>On Feb 21, 2012, at 7:15 PM, Ben Bolker wrote:
>> Kevin J. Ryan <Kevin_J_Ryan at ...> writes:
>>> I'm attempting to use mixed-model logistic regression to model
>>> spadefoot emergence as a function of weather variables (individuals
>>> are monitored continuously from 1-84 days [with gaps]). However,
>>> the weather variables are serially autocorrelated, apparently at a
>>> lag of 12 days or so. Does anyone have experience incorporating a
>>> temporal autocorrelation structure of predictor variables into a
>>> glm? I've been examining the lme4 package but it does not appear to
>>> be able to do this.
>> A couple of quick thoughts:
>> * you could use glmmPQL (in the MASS package), which does allow any
>> of the correlation structures that are defined in the nlme
>> package (including corCAR1, which allows for gappy data). This
>> is not preferred for binary data, but probably (?) correcting
>> for correlation and using a slightly questionable estimation method
>> is better than ignoring correlation.
>> * if your responses are measured without error you might
>> be able to use emergences at a previous time point as
>> a predictor.
>> * you could just use glm (or whatever) and evaluate the correlations
>> among the residuals -- if there's nothing going on there then you
>> have a reasonable excuse for proceeding without a correlation model.
>> * the fact that the _predictor_ variables are autocorrelated isn't
>> that much of a big deal -- it's really the response (or rather the
>> residuals of the response) that you should be worried about, although
>> there is always a bit of an issue in time-series analysis in
>> looking at relationships of autocorrelated series with other
>> autocorrelated series ...
>> * generalized estimating equations (GEE: see geepack etc.) are
>> another approach, although I don't know if any of the R packages
>> that do GEEs have an option for autocorrelations on unevenly
>> spaced data (try installing the "sos" package and searching
>> via something like findFn("gee uneven"))
>> * in my opinion the gold standard (if the data are rich enough
>> to warrant it) is to build a hierarchical model with a latent
>> normally distributed variable with temporal autocorrelation and
>> an observed binary variable (emergence) on top of it, but this
>> is fairly hard work -- you'd need AD Model Builder or some
>> dialect of BUGS.
>> I will be interested to see if anyone has better suggestions.
>> I would check the books from Highland Statistics (Zuur et al.)
>> to see if they have anything useful ...
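Two of Ben's suggestions could be sketched roughly as below. The data frame `toads` and its columns (`Emergence`, `Tavg`, `Toad`, `Day`) are placeholder names, not the poster's actual data:

```r
## Rough sketch of two of the approaches suggested above.
library(MASS)     # glmmPQL
library(nlme)     # corCAR1
library(geepack)  # geeglm

## (1) PQL fit with a continuous-time AR(1) correlation structure,
## which tolerates gaps in the monitoring record:
fit_pql <- glmmPQL(Emergence ~ Tavg, random = ~ 1 | Toad,
                   family = binomial,
                   correlation = corCAR1(form = ~ Day | Toad),
                   data = toads)

## (2) A GEE with an AR(1) working correlation; note that
## geepack's "ar1" assumes evenly spaced observations, so gappy
## data would need extra care here:
fit_gee <- geeglm(Emergence ~ Tavg, id = Toad, family = binomial,
                  corstr = "ar1", data = toads)
```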
>> R-sig-mixed-models at r-project.org mailing list
>Anthony Ragnar Ives
>Department of Zoology