[R-sig-Geo] Spatial Autocorrelation for point data

Thu Aug 12 15:25:14 CEST 2010

Dear Community,

I hope that my question is not misplaced here, but I do not know where (or whom) else to ask. My problem concerns methodological issues as well as the search for the right R function.

I have been working with spatial data for quite some time and have now been asked to test my data for spatial autocorrelation. The more I read on the topic, the less certain I am that this kind of analysis is really made for my kind of data. I work on plots distributed over a study area of not more than 30 x 30 km. These plots are point data in the sense of point coordinates. Any two plots are at least 4 km apart, but the plots are not evenly distributed over the area. For these plots I have data on species richness and habitat. So far I have done all my analyses using vector data sets (in the form of shapefiles) and have never used raster data.

I have often been told simply to use Moran's I for my analyses of spatial autocorrelation because everybody else is using it. And hey, never touch a running system, so why should we use something different? But I do not know whether this kind of analysis really works with data that are not organised in grid cells (i.e. raster data). I mean, it runs and I get values, but are these values reliable when I use point data with no information in between? In my trials, my Moran's I correlograms follow a zigzag pattern.

I will probably never reach the level where I fully understand the mathematics behind the latest statistical methods, but I hope to at least reach a level that enables me to judge which method should be used for a particular kind of data and/or problem. For many analyses it has been stated that they are mainly meant for global data or should be applied at larger spatial scales. So which kind of analysis is best for my small spatial data set, and how can I get meaningful results for my analysis of spatial autocorrelation with R? Should I use Moran's I or Geary's C or something completely different? Is it necessary to transform my data into raster data, or do the tests also work with point data? How many neighbours should I choose? (I have tried 2 and 4 so far.)

Cheers,

Nils