[R-sig-Geo] Comparing abundances at fixed locations in space - Syrjala test

Mon Feb 11 11:46:37 CET 2008

Jean-Olivier Irisson wrote:

> Thank you for the pointer. The vignette of geoRglm seems promising, 
> though much is about prediction from a given model while I am most 
> interested in which terms are in the model, i.e. which variables have a 
> notable influence on the repartition of the organisms. My scenario seems 
> simpler than those presented however, since the data are standardized by 
> the sampling effort, meaning that the same Poisson law applies to all 
> points.

  I think you still need to fit a model, and then you can test how 
useful your covariates are with standard techniques.

> A continuous variable that would represent the spatiality in this 
> dataset could simply be the distance from the lower-left corner of the 
> sampling grid for example, or the distance from the island around which 
> the sampling grid is designed (such a distance would have a biological 
> meaning since we expect the abundances to be inversely proportional to 
> it). Is that something that could fit your definition of a 
> "smoothly-varying spatial error term" or am I completely mistaken?

  Think about fitting a straight line through some points. You find the 
line that best fits your points. Then you look at the residuals, the 
differences between the line and your points. All the usual linear model 
theory about predictions and significance depends on those residuals 
being independent of each other. If you are fitting a straight line to 
a curve then that won't be true, because neighbouring residuals will 
tend to share the same sign, and if you then say something about your 
straight line based on the linear model theory you'll be wrong.
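
  A minimal sketch of that point, using simulated data rather than 
anything from this thread. The curve, the noise level and the variable 
names are all made up for illustration:

```r
# Simulated example: the true relationship is curved.
set.seed(1)
x <- seq(0, 10, length.out = 100)
y <- (x - 5)^2 + rnorm(100, sd = 1)

# Fit a straight line anyway.
fit <- lm(y ~ x)
res <- residuals(fit)

# Correlation between each residual and the next one along x.
# For independent residuals this should be near zero.
lag1 <- cor(res[-length(res)], res[-1])
print(lag1)

# The standard errors and p-values in this table assume independent
# residuals, which these are not, so they should not be trusted.
print(summary(fit)$coefficients)
```

  A lag-one correlation well away from zero is the numerical version of 
"the residuals show structure".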

  Now, you could fit a non-spatial generalised linear model to your data 
using glm() in R and then map the residuals. If the residual map shows 
structure, then there's something else going on that your model hasn't 
accounted for. Perhaps there is an obvious trend due to a covariate 
you've not included, such as elevation above sea level. You could then 
add this to your model. If the residual surface looks like random noise 
then you can use standard linear model theory to draw conclusions about 
your covariate parameters.
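
  Roughly, and only as a sketch: the grid layout, the island at the 
origin, the column names (x, y, d_island, count) and the simulated 
counts below are assumptions standing in for the real survey data.

```r
# Simulated stand-in for the survey: a 10 x 10 grid of stations around
# an island at the origin (hypothetical layout).
set.seed(42)
grid <- expand.grid(x = 1:10, y = 1:10)
grid$d_island <- sqrt(grid$x^2 + grid$y^2)     # distance from the island
grid$count <- rpois(nrow(grid), lambda = exp(3 - 0.2 * grid$d_island))

# Non-spatial Poisson GLM with distance as the only covariate.
fit <- glm(count ~ d_island, family = poisson, data = grid)
print(summary(fit)$coefficients)               # estimates, SEs, p-values

# Pearson residuals put back onto the grid (rows = x, columns = y).
grid$res <- residuals(fit, type = "pearson")
res_map <- matrix(grid$res, nrow = 10, ncol = 10)

# Draw the map when running in an interactive session.
if (interactive()) {
  image(1:10, 1:10, res_map, xlab = "x", ylab = "y",
        main = "Pearson residuals")
}
```

  With real data you would replace the simulated data frame with your 
own, one row per station, and look for patches or gradients in the map.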

  If the residual surface doesn't look like random noise then that's 
when you get into geoRglm functions which (I think) fit a GLM where the 
error surface (that's your residuals) is modelled as a Gaussian random 
field with a fitted covariance structure. Once that's done, the geoRglm 
code will tell you about your covariate parameter significance (I think! 
It's been a while since I've used it. Maybe Paulo and Ole can expand on 
this).

  So what I'd do is:

  * Fit a simple GLM using glm().
  * Look at parameter estimates and significance.
  * Draw a map of residuals.
  * Then worry about spatial correlation.
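
  One way to put a number on the last step before reaching for 
geoRglm, sketched on simulated data. Moran's I with inverse-distance 
weights and a permutation test is my choice of check here, not 
something geoRglm does for you; the helper morans_i() and all column 
names are made up for this example.

```r
# Simulated stations on a 10 x 10 grid (hypothetical, no real data).
set.seed(7)
grid <- expand.grid(x = 1:10, y = 1:10)
grid$d_island <- sqrt(grid$x^2 + grid$y^2)
grid$count <- rpois(nrow(grid), lambda = exp(3 - 0.2 * grid$d_island))

# Fit the simple GLM and keep its Pearson residuals.
fit <- glm(count ~ d_island, family = poisson, data = grid)
res <- residuals(fit, type = "pearson")

# Inverse-distance weights between every pair of stations.
w <- 1 / as.matrix(dist(grid[, c("x", "y")]))
diag(w) <- 0                                   # no self-weight

# Moran's I: weighted cross-product of centred values over their variance.
morans_i <- function(z, w) {
  z <- z - mean(z)
  (length(z) / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}
obs <- morans_i(res, w)

# Permutation test: shuffle residuals over stations to get the spread
# expected when there is no spatial structure.
n_perm <- 999
perm <- replicate(n_perm, morans_i(sample(res), w))
p_value <- (sum(perm >= obs) + 1) / (n_perm + 1)
print(c(I = obs, p = p_value))
```

  A small p-value here would suggest positively correlated residuals, 
which is the point at which the spatial model becomes worth the effort.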

  Oh, if I were you, I'd also try to find a local expert statistician!

Barry