[R-sig-Geo] spatial autocorrelation, grids, and the ks-test

Mon Feb 8 10:38:32 CET 2010

On Fri, 5 Feb 2010, ingalapuma wrote:

> I would like to run the Moran's I on a .img file of forest composition
> change categories.

(Please, could anonymous questioners give some indication of their 
standing, for example in an informative signature? The R lists posting 
guide does say: "Some consider it good manners to include a concise 
signature specifying affiliation" - I am one of these some, for good 
reason. Helpers find it much easier to help when they have some idea of 
disciplinary background and motivation of the questioner, and a 
signature field can provide these.)

There are many questions here. Firstly, Moran's I is for continuous 
variables, not categories. For categories, use join count statistics. 
Secondly, the possible dependency in your data will be driven both 
by any inherent (substantive) dependency, and by the match (or mismatch) 
between the scale(s) of the phenomena and the resolution of your grid. In 
addition, there may be background variables that drive the dependency, 
like slope or elevation, so observed dependency even for correct scaling 
may not say anything about the effect of contiguity.
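To make the join count idea concrete, here is a minimal sketch in plain Python (not spdep code; spdep has its own join count tests such as joincount.test() and joincount.mc()). It counts same-category joins under rook contiguity and compares the count with random relabellings of the cells. The toy map, the function names and the number of simulations are all assumptions made for illustration.

```python
import random

def rook_pairs(nrow, ncol):
    """Yield each pair of rook-adjacent cells exactly once."""
    for r in range(nrow):
        for c in range(ncol):
            if c + 1 < ncol:
                yield (r, c), (r, c + 1)   # join to the right
            if r + 1 < nrow:
                yield (r, c), (r + 1, c)   # join downwards

def same_category_joins(grid, category):
    """Count joins whose two cells both carry `category`."""
    nrow, ncol = len(grid), len(grid[0])
    return sum(
        1
        for (r1, c1), (r2, c2) in rook_pairs(nrow, ncol)
        if grid[r1][c1] == category and grid[r2][c2] == category
    )

def permutation_p_value(grid, category, nsim=999, seed=1):
    """One-sided pseudo p-value under random relabelling of the cells."""
    rng = random.Random(seed)
    nrow, ncol = len(grid), len(grid[0])
    observed = same_category_joins(grid, category)
    flat = [v for row in grid for v in row]
    at_least = 0
    for _ in range(nsim):
        rng.shuffle(flat)
        shuffled = [flat[i * ncol:(i + 1) * ncol] for i in range(nrow)]
        if same_category_joins(shuffled, category) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (nsim + 1)

# Hypothetical 6 x 6 map: category "A" fills the left half, "B" the right.
toy = [["A"] * 3 + ["B"] * 3 for _ in range(6)]
observed, p = permutation_p_value(toy, "A")
print(observed, p)
```

A small pseudo p-value only says that cells of the same category are joined more often than random labelling would give; it does not separate substantive dependency from resolution or omitted-variable effects.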

You can of course compute spatial autocorrelation of your data (treat them 
as having point support and use a distance criterion like the raster step 
for rook neighbours or the diagonal step for queen neighbours). It's just 
that the test will only reflect your resolution and possibly omitted 
variables. With raster data, especially with fine resolution, one 
typically has many "observations" of the same "natural" entity or object, 
leading to apparent spatial autocorrelation.
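The distance criterion can be sketched as follows, again in plain Python rather than R (in spdep, dnearneigh() does this job on a coordinate matrix). Cell centres are treated as points; an upper distance of one raster step gives rook neighbours, and one diagonal step (step times the square root of 2) gives queen neighbours. The cell size of 30 and the 3 x 3 grid are hypothetical, and the brute-force search is only sensible for small subsets.

```python
import math

def grid_centres(nrow, ncol, step=1.0):
    """Cell-centre coordinates in row-major order."""
    return [(c * step, r * step) for r in range(nrow) for c in range(ncol)]

def distance_neighbours(coords, d_max, tol=1e-9):
    """For each point, the indices of all other points within d_max."""
    nbrs = []
    for i, (xi, yi) in enumerate(coords):
        nbrs.append([
            j
            for j, (xj, yj) in enumerate(coords)
            if j != i and math.hypot(xi - xj, yi - yj) <= d_max + tol
        ])
    return nbrs

step = 30.0                      # hypothetical raster step
coords = grid_centres(3, 3, step)
rook = distance_neighbours(coords, step)                  # raster step
queen = distance_neighbours(coords, step * math.sqrt(2))  # diagonal step

print(len(rook[4]), len(queen[4]))  # neighbour counts for the centre cell
```

The small tolerance guards against floating-point error when a neighbour lies exactly at the threshold distance.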

> I need to know if I can treat the categories as
> independent for performing the two sample Kolmogorov-Smirnov on relative
> fire frequency distributions of these forest composition change categories.
> Without considering spatial autocorrelation, the fire frequency
> distributions of the categories are significantly different according to the
> ks-test (which I was advised to use).  It is obvious that these categories
> are spatially clustered as are the areas of similar fire frequency, so my
> questions/statements are:

I think that you could consider checking your KS results against a better 
null, that is, one generated from distributions with the same level of 
spatial dependence (or rather not use KS, but fit a model and evaluate it 
in the context of dependence).
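One possible way to build such a null, offered here as an assumption and not necessarily what is meant above, is a toroidal shift: the category map is shifted with wrap-around relative to the fire frequency map, which keeps the spatial pattern within each map but breaks the association between them. The sketch below is plain Python with a hand-written two-sample KS statistic; the toy grids, function names and number of simulations are hypothetical. Wrap-around creates artificial edges, and a small grid allows only a few distinct shifts.

```python
import random

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_between(categories, values, cat_a, cat_b):
    """KS statistic between the values found in two categories."""
    cells = [(c, v) for rc, rv in zip(categories, values)
             for c, v in zip(rc, rv)]
    a = [v for c, v in cells if c == cat_a]
    b = [v for c, v in cells if c == cat_b]
    return ks_statistic(a, b)

def torus_shift(grid, dr, dc):
    """Shift a grid by (dr, dc) cells with wrap-around."""
    nrow, ncol = len(grid), len(grid[0])
    return [[grid[(r + dr) % nrow][(c + dc) % ncol] for c in range(ncol)]
            for r in range(nrow)]

def shift_null_p_value(categories, values, cat_a, cat_b, nsim=199, seed=1):
    """Pseudo p-value of the observed KS statistic under random shifts."""
    rng = random.Random(seed)
    nrow, ncol = len(categories), len(categories[0])
    observed = ks_between(categories, values, cat_a, cat_b)
    hits = 0
    for _ in range(nsim):
        dr, dc = 0, 0
        while dr == 0 and dc == 0:   # skip the identity shift
            dr, dc = rng.randrange(nrow), rng.randrange(ncol)
        shifted = torus_shift(categories, dr, dc)
        if ks_between(shifted, values, cat_a, cat_b) >= observed:
            hits += 1
    return observed, (hits + 1) / (nsim + 1)

# Hypothetical 8 x 8 maps: two clustered categories, smooth frequency trend.
cats = [["A"] * 4 + ["B"] * 4 for _ in range(8)]
freq = [[float(c) for c in range(8)] for _ in range(8)]
observed, p = shift_null_p_value(cats, freq, "A", "B")
print(observed, p)
```

In this toy case the two distributions do not overlap at all, yet many shifts reproduce the same separation, so the shift-based pseudo p-value is far less extreme than a KS test that treats the cells as independent would suggest.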

>
> 1.  Does it make sense to run the Moran's I on the thematic grid data with
> R? (Have tried to convert it to vector in both ArcGIS and Imagine and have
> run into errors...presumably because of size and complexity?)
>

In spdep use dnearneigh() to make a list of neighbours. If there is a lot 
of data in the raster, you will not gain any insight anyway - consider 
vigorous subsetting. Contrast the subsets.

> 2.  I have no idea how to do anything in R (besides the ks-test) and am
> completely new to it's spatial tools (although I am keen to learn and have
> loaded several of the packages/libraries).
>

Since the KS test may not be a good idea anyway, perhaps start afresh? 
You'd need to vary your raster resolution anyway to get any idea of how 
scaling works. This is going to take substantial time to do right. In your 
best case, join count tests are insignificant, and you can risk KS, but if 
you can see spatial patterning and have put the question on the table, you 
need an alternative approach.
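Varying the resolution of a categorical raster can be sketched as block aggregation by majority category, after which any join count or other analysis is repeated on each coarser grid. This is a plain-Python illustration with a hypothetical 4 x 4 map; the majority rule and the handling of ties (first category encountered in the block wins) are assumptions, and other aggregation rules are equally defensible.

```python
from collections import Counter

def aggregate_majority(grid, factor):
    """Coarsen a categorical grid: each factor x factor block becomes
    one cell carrying the block's most common category.
    Rows and columns that do not fill a whole block are dropped."""
    nrow, ncol = len(grid), len(grid[0])
    out = []
    for r0 in range(0, nrow - nrow % factor, factor):
        row = []
        for c0 in range(0, ncol - ncol % factor, factor):
            block = [grid[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)]
            row.append(Counter(block).most_common(1)[0][0])
        out.append(row)
    return out

# Hypothetical 4 x 4 category map.
fine = [
    ["A", "A", "B", "B"],
    ["A", "B", "B", "B"],
    ["A", "A", "A", "B"],
    ["A", "A", "B", "B"],
]
coarse = aggregate_majority(fine, 2)
print(coarse)
```

If the apparent clustering changes markedly between resolutions, that is a sign that the grid step, and not only the phenomenon, is driving the measured dependency.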

Roger

> 3.  How would I incorporate the p-value results of the Moran's I (if I can
> get it to work) into the ks-test of my relative frequencies?
>
> I have ordered the spatial data with R book but I figured I'd cut to the
> chase and see if you all could help me get a jumpstart on this.
>
> Thanks ahead of time.
>
>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no