[R-sig-eco] Summary of the Answers I got for my post "How to incorporate spatial autocorrelation in multivariate GLM "

Sat Sep 12 14:06:33 CEST 2015

Hello,

A few days ago I posted the following question, and got the answers below:

*Dear friends,*

*I would like to ask for some advice.*

*I am embarking on the analysis of occurrence data for 3,000 plant species
across biogeographic scales in South America. I would like to try moving
from more traditional distance-based multivariate analysis (e.g., RDA on
Hellinger-transformed abundance data) to multivariate GLMs as proposed by
you (mvabund package) and also by Yee (VGAM package).*

*However, distance-based methods have grown to incorporate spatial
dependency through the development of MEM and AEM techniques, which model
symmetric and asymmetric spatial relationships and can be included in the
explanatory side of the analysis.*

*Reading the multivariate GLM papers, however, I have not found exactly how
to control for or include spatial autocorrelation. I am thinking of including
MEM and perhaps AEM variables simply as co-variables added to the
explanatory environmental variables in the multivariate GLM.*

*Is this a step I will regret later on? Is this ok?*

*A second quick question: common GLM analyses are carried out as a series
of nested models in which we exclude variables from an initial full model
based on ANOVAs/AIC. I suppose this is also true for multivariate GLMs. Is
it? Can I compare successive models using the same approach used in common
GLMs?*

*Thanks in advance for any thoughts,*

*All the best,*

*Alexandre*

Replies **************************

*David Warton*

*Hi Alex,*

*Thanks for the e-mail, sounds like interesting stuff!*

*Yes, as you say, you could use the MEM and AEM techniques with manyglm.
While this is not the best approach for handling spatial data, it is
the simplest, and currently the best available given the lack of code for
an alternative.*

*And yes, you could use an AIC approach for model selection.*
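As a rough illustration of this suggestion, the sketch below adds dbMEM eigenvectors as ordinary covariates in manyglm and compares summed AIC between a model with and without them. Everything here is an assumption made for illustration: the data are simulated, the species and variable names (sp1..sp5, temp) are hypothetical, keeping only the first two MEMs is arbitrary, and the adespatial package is one possible way of building the MEMs.

```r
library(mvabund)     # manyglm()
library(adespatial)  # dbmem()

set.seed(4)
n   <- 60
xy  <- cbind(x = runif(n), y = runif(n))   # hypothetical site coordinates
env <- data.frame(temp = rnorm(n))         # hypothetical environmental variable

# Simulated presence/absence of 5 hypothetical species with a west-east trend
occ <- sapply(1:5, function(j) {
  rbinom(n, 1, plogis(0.8 * env$temp + 2 * (xy[, "x"] - 0.5)))
})
colnames(occ) <- paste0("sp", 1:5)

# Distance-based MEMs with positive spatial autocorrelation
mem <- as.data.frame(dbmem(xy, MEM.autocor = "positive"))
X   <- cbind(env, mem[, 1:2, drop = FALSE])  # first two MEMs only (arbitrary)

Y <- mvabund(occ)
fit_env <- manyglm(Y ~ temp, family = "binomial", data = X)
fit_mem <- manyglm(Y ~ temp + MEM1 + MEM2, family = "binomial", data = X)

# Sum of per-species AIC values for each model (lower is better)
c(env = fit_env$AICsum, env_mem = fit_mem$AICsum)
```

With real data, which MEMs to keep would itself be a model-selection question, so the two-MEM choice above should not be read as a recommendation.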

*****

*Hi,*

*The only thing I am aware of is the set of spatial correlation structures
available in the nlme package:*

*for example:*

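    library(nlme)

    # Null model: random intercept only, fitted by maximum likelihood
    null.model <- lme(fixed = A ~ B, data = data, random = ~ 1 | dummy,
                      method = "ML")

    # Same model plus an exponential spatial correlation on the x/y coordinates
    cor.model <- update(null.model, correlation = corExp(form = ~ x + y),
                        method = "ML")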

*The "correlation" argument accepts several forms of spatial model based on
the variogram (here exponential, based on xy coordinates). One can extract
model goodness of fit with AIC() or just summary().*
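A self-contained sketch of this comparison follows, using simulated data. Note that it swaps in gls() from nlme rather than lme() with a dummy grouping factor; that substitution, the simulated variables, and the seed are assumptions made so the example runs on its own. Other correlation classes in nlme, such as corGaus or corSpher, could be substituted for corExp in the same way.

```r
library(nlme)

set.seed(3)
n   <- 80
dat <- data.frame(x = runif(n, 0, 10), y = runif(n, 0, 10), B = rnorm(n))
# Response with a smooth spatial trend, so nearby sites resemble each other
dat$A <- 1 + 0.8 * dat$B + sin(dat$x / 2) + rnorm(n, sd = 0.5)

# Model without spatial correlation, fitted by ML so AIC values are comparable
null.model <- gls(A ~ B, data = dat, method = "ML")

# Same fixed effects, with exponential spatial correlation in the errors
cor.model <- update(null.model, correlation = corExp(form = ~ x + y))

AIC(null.model, cor.model)    # lower AIC indicates the better-supported model
anova(null.model, cor.model)  # likelihood-ratio comparison of the two fits
```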

*However, this is a univariate GLM (though it can be extended to include
interactions), and as far as I was told these procedures only exist for
Gaussian distributions, not for Poisson/NB, which are better for species
data most of the time.*

*I was looking for the same, but in the end I went back to RDA with dbMEMs
and used the aforementioned procedure only for highly correlated univariate
pairs in the dataset.*

*Please let me know if you are more successful.*

*****

*Hi Alexandre,*

*Not sure what the best solution is, but a few hacker ideas come to mind.
First, you could create a spatially lagged variable from scratch.  This
would be created by deciding on a neighborhood size, say first order
neighbors, and then creating a variable that was the average response (Y)
value for the first order neighbors.  Neighborhood size could be
guesstimated by looking at residual maps.  This is similar to what happens
in simultaneous autoregressive (SAR) lagged models. Then this lagged
variable could be a fixed covariate in your model.  You could test
residuals from the lagged model to see if this removed your spatial
autocorrelation.*
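The lagged-variable idea can be sketched in base R as below. The data are simulated, and the neighbourhood radius, the fallback for sites with no neighbours, and the hand-written moran_i() helper are all assumptions made for illustration rather than part of the suggestion itself.

```r
set.seed(1)
n   <- 100
dat <- data.frame(x = runif(n), y = runif(n), env = rnorm(n))
# Counts driven by an environmental effect plus a smooth spatial trend
dat$count <- rpois(n, exp(0.5 + 0.6 * dat$env + 1.5 * sin(3 * dat$x)))

# Neighbourhood: sites closer than `radius` (assumed value; in practice it
# would be tuned by looking at residual maps), excluding the site itself
radius <- 0.15
d <- as.matrix(dist(dat[, c("x", "y")]))
W <- (d > 0 & d < radius) * 1

# Lagged variable: mean response over each site's neighbours; sites with no
# neighbours fall back to the overall mean (an arbitrary choice)
n_nb    <- rowSums(W)
lag_sum <- as.vector(W %*% dat$count)
dat$lag_count <- ifelse(n_nb > 0, lag_sum / pmax(n_nb, 1), mean(dat$count))

fit_env <- glm(count ~ env, family = poisson, data = dat)
fit_lag <- glm(count ~ env + lag_count, family = poisson, data = dat)

# Moran's I of a vector z under binary neighbour weights W
moran_i <- function(z, W) {
  z <- z - mean(z)
  (length(z) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Residual spatial autocorrelation without and with the lagged covariate
moran_i(residuals(fit_env, type = "pearson"), W)
moran_i(residuals(fit_lag, type = "pearson"), W)
```

If the second Moran's I is closer to zero than the first, the lagged covariate has absorbed some of the spatial structure. A permutation test would be needed to judge whether what remains is significant.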

*Since you mentioned a GAM approach, you could also do a spatial GAM, where
Lat and Long variables are specified as smooth covariates with lots of
knots to account for short range spatial structure. Again, you could test
your residuals to see if this removed your spatial autocorrelation.*
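One possible way to write such a spatial GAM uses the mgcv package, sketched below on simulated data. The choice of mgcv, the bivariate smooth of the coordinates, the basis dimension k = 60, and all variable names are assumptions made for illustration.

```r
library(mgcv)

set.seed(2)
n   <- 200
dat <- data.frame(lon = runif(n), lat = runif(n), env = rnorm(n))
# Counts with an environmental effect and a smooth spatial surface
dat$count <- rpois(n, exp(0.3 + 0.5 * dat$env +
                          sin(4 * dat$lon) * cos(4 * dat$lat)))

# Bivariate thin-plate smooth of the coordinates. k is the basis dimension,
# an upper limit on how wiggly the surface can be; a larger k lets the
# smooth capture shorter-range structure.
fit_gam <- gam(count ~ env + s(lon, lat, k = 60),
               family = poisson, data = dat, method = "REML")

summary(fit_gam)

# Residuals to be checked for remaining spatial autocorrelation
res <- residuals(fit_gam, type = "pearson")
```

mgcv's gam.check() reports whether k looks too small, which is one way to decide how many knots count as "lots" for a given data set.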

*If you are comfortable with Bayesian modeling, Banerjee et al. (2015,
‘Hierarchical modeling and analysis for spatial data’) have a chapter on
multivariate spatial modeling, with a brief mention of generalized linear
models.*

*Some food for thought.*

*****

*Alexander,*

*Any chance you might include spatial dependency (however you may choose to
do it) as a random effect in a mixed-model structure?  This way you can
either run the model with the spatial dependency to test this explicitly or
remove this effect from the model structure.*

*And yes, you can use AIC to rank multivariate models.*

*Just a quick note.*

*****

*Furthermore I received the suggestion to read the following papers:*

*Thorson, J. T., Scheuerell, M. D., Shelton, A. O., See, K. E., Skaug, H. J.,
and Kristensen, K. (2015). Spatial factor analysis: a new tool for estimating
joint species distributions and correlations in species range. Methods in
Ecology and Evolution.*

*Thorson, J. T., Shelton, A. O., Ward, E. J., and Skaug, H. J. Geostatistical
delta-generalized linear mixed models improve precision for estimated
abundance indices for West Coast groundfishes. ICES Journal of Marine
Science; doi:10.1093/icesjms/fsu243.*

*Thorson, J. T., Skaug, H. J., Kristensen, K., Shelton, A. O., Ward, E. J.,
Harms, J. H., and Benante, J. A. (2015). The importance of spatial models for
estimating the strength of density dependence. Ecology, 96(5), 1202–1212.*

-- 
Dr. Alexandre F. Souza
Professor Adjunto III
Universidade Federal do Rio Grande do Norte
CB, Departamento de Ecologia
Campus Universitário - Lagoa Nova
59072-970 - Natal, RN - Brasil
lattes: lattes.cnpq.br/7844758818522706
http://www.docente.ufrn.br/alexsouza
