[R-sig-ME] Logistic regression with spatial autocorrelation structure

Douglas Bates bates at stat.wisc.edu
Mon Jan 17 18:07:05 CET 2011

On Mon, Jan 17, 2011 at 8:33 AM, Ben Bolker <bbolker at gmail.com> wrote:
> On 11-01-17 09:03 AM, Arnaud Mosnier wrote:
>> Dear list,
>>
>> Is there a way in R to do a mixed logistic regression with a spatial
>> autocorrelation structure?
>>
>
>  The only straightforward solution I know of is via glmmPQL (MASS
> package), although you should be careful (e.g. try fitting to simulated
> data where you know the answer first) because penalized quasi-likelihood
> is biased for binary data.

As in previous discussions of modeling a marginal variance-covariance
structure for the response in a generalized linear mixed model (adding
spatial correlation is one example), I think it is best to consider
the model carefully to ensure that it is sensible.

For a linear model where the distribution of the response is
multivariate Gaussian or for a linear mixed model where the
conditional distribution of the response, given the random effects, is
multivariate Gaussian, it is possible to model the variance-covariance
of this distribution separately from the model for the mean.  In the
case of a logistic GLM or GLMM the only way that I know how to make
sense of the model is that the distribution of the response (for known
values of the parameters) or the conditional distribution of the
response, given the random effects and model parameters, is that of a
vector of independent Bernoulli or binomial random variables.
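
The contrast can be written out as a sketch.  The notation here is an
illustrative assumption, not something from this thread: X and Z are
design matrices, beta the fixed effects, b the random effects,
Lambda_phi a correlation matrix with its own parameters phi, and
Sigma_theta the covariance of the random effects.

```latex
% Gaussian LMM: the conditional covariance \sigma^2 \Lambda_\phi is a
% separate model component, free to carry spatial correlation.
y \mid b \sim \mathcal{N}\!\left(X\beta + Zb,\; \sigma^2 \Lambda_\phi\right),
\qquad b \sim \mathcal{N}(0, \Sigma_\theta)

% Logistic GLMM: conditionally independent Bernoulli responses whose
% variance is fixed by the mean, leaving no separate covariance to model.
y_i \mid b \overset{\text{ind}}{\sim} \mathrm{Bernoulli}(\mu_i),
\qquad \operatorname{logit}(\mu_i) = x_i^\top \beta + z_i^\top b,
\qquad \operatorname{Var}(y_i \mid b) = \mu_i (1 - \mu_i)
```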

Because the iterative scheme for determining the parameter estimates
in a GLM (or the conditional parameter estimates and the modes of the
random effects in a GLMM) is based on a weighted least squares
calculation, it is sometimes assumed that this can be converted to a
generalized least squares calculation.  You can certainly carry out
that calculation, but it doesn't correspond to a probability model
that I can describe.
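
To make the weighted least squares step concrete, here is a minimal
base-R sketch of iteratively reweighted least squares for a plain
logistic GLM.  The simulated data, the true coefficients, and the
names (X, beta, w, z) are illustrative assumptions, not anything from
this thread.  The working weights mu * (1 - mu) form a diagonal matrix
that is fixed by the mean.  Putting a non-diagonal matrix in their
place gives a generalized least squares step numerically, but nothing
in the Bernoulli likelihood supplies such a matrix.

```r
# Simulated binary data (assumed example, not from the thread)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))

X <- cbind(1, x)     # design matrix: intercept and slope
beta <- c(0, 0)      # starting values

for (iter in 1:25) {
  eta <- drop(X %*% beta)     # linear predictor
  mu  <- plogis(eta)          # fitted probabilities
  w   <- mu * (1 - mu)        # working weights (diagonal only)
  z   <- eta + (y - mu) / w   # working response

  # Weighted least squares: solve (X' W X) beta = X' W z
  beta_new <- as.numeric(solve(crossprod(X, w * X), crossprod(X, w * z)))

  converged <- max(abs(beta_new - beta)) < 1e-10
  beta <- beta_new
  if (converged) break
}

# Reference fit from glm(), which uses the same iterative scheme
fit <- glm(y ~ x, family = binomial)
print(cbind(irls = beta, glm = unname(coef(fit))))
```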

It may be that I don't know enough about generalized linear models to
understand how one would incorporate spatial correlation in such a
model but I am somewhat suspicious of the practice.  In discussing
this with Jun Zhu, our local expert on spatial statistics, she
suggested that a preferred practice is to build the spatial
correlation into the distribution of the random effects, say by having
one random effect per location, and that would make sense to me
because the marginal distribution of the random effects is
multivariate Gaussian.
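
As a sketch of what that suggestion means as a generative model, the
base-R code below simulates one Gaussian random effect per location
with an exponential spatial covariance, then draws conditionally
independent Bernoulli responses.  The exponential form, the parameter
values, and all names (east, north, sigma_b, range_b, sim) are
assumptions chosen for illustration.  Data simulated this way can also
serve as the known-answer check recommended in the quoted reply.

```r
set.seed(42)
n_loc <- 100

# Random locations on the unit square (assumed layout)
east  <- runif(n_loc)
north <- runif(n_loc)
D <- as.matrix(dist(cbind(east, north)))   # pairwise distances

# Exponential covariance for the random effects (assumed form and values)
sigma_b <- 1      # standard deviation of the random effects
range_b <- 0.2    # spatial range parameter
Sigma <- sigma_b^2 * exp(-D / range_b)

# One spatially correlated Gaussian random effect per location:
# if R = chol(Sigma), then t(R) %*% z has covariance Sigma for z ~ N(0, I)
b <- drop(t(chol(Sigma)) %*% rnorm(n_loc))

# Given b, the responses are independent Bernoulli draws
x   <- rnorm(n_loc)
eta <- -0.5 + 1.0 * x + b
y   <- rbinom(n_loc, size = 1, prob = plogis(eta))

sim <- data.frame(y, x, east, north, loc = factor(seq_len(n_loc)))
str(sim)
```

All of the spatial dependence sits in b, so the response keeps the
independent-Bernoulli form conditional on the random effects.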

>   One of the generalized estimating equation packages (geepack, etc.)
> may also work.  If you have only spatial autocorrelation (i.e. no random
> block effects) then the geoRglm and spatcounts packages may also be of use.
>
>  Ben Bolker