[R-sig-Geo] Spatial analysis question

Wed Nov 4 05:10:58 CET 2009

Marcelo Tognelli <mtognelli at lab.cricyt.edu.ar> writes:

> I have probability maps of the distribution of 4 species of venomous snakes
> (raster files output from species distribution modeling software) and point
> locality data with information on snake bite events (most of them without
> the id of the species involved in the accident). I would like to run an
> analysis to see what species correlates best with snake bite events. 

> My idea is to generate a kernel density raster from the point event data and then do
> some kind of spatial correlation against the species distribution maps.

You don't need to smooth the point event data to obtain a correlation. The correlation between a 
point process and a random field is well-defined. In the 'spatstat' package, if X is a point pattern
and Z is a pixel image, then the sample correlation can be computed by

      Xp <- pixellate(X, W=as.owin(Z))

      cor(as.vector(as.matrix(Z)), as.vector(as.matrix(Xp)), use="pairwise.complete.obs")

However, the interpretation of correlations in a spatially-inhomogeneous dataset is beset with problems.

Probably the best way to analyse these data is to regard the snake bite events as a point process 'response'
and the species distribution data as the 'covariates' that may be used to explain the response. 
This is similar to many standard analyses in spatial epidemiology.

The very simplest statistical assumption would be that each species has a different intrinsic probability 
of biting (the probability that in a particular confrontation a particular snake will bite). 
In that case the intensity of snake bites (number of snake bites per unit area, in the survey period) 
would be proportional to p1 * a1 + p2 * a2 + p3 * a3 + p4 * a4 where p1, p2..p4 are the (unknown) bite probabilities of each species, and a1, a2...a4 are the (known) spatially-varying densities of abundance for the 4 species. Assuming snakes act independently of each other (!) the snake bites are a Poisson process with this intensity. 

Unfortunately this model cannot (yet) be fitted in 'spatstat' because it is linear and the current implementation
of 'ppm' is restricted to canonical (loglinear) models. The following hack would work.
        objfun <- function(beta, X, Z1, Z2, Z3, Z4) {
             p <- exp(beta)
             inten <- eval.im(p[1] * Z1 + p[2]* Z2 + p[3]*Z3 + p[4] * Z4)
             -sum(log(inten[X])) + summary(inten)$integral
        }
        op <- optim(rep(0,4), objfun, X=bites, Z1=species1, Z2=species2, Z3=species3, Z4=species4)
        p <- exp(op$par)

where 'bites' is your point pattern of snake bites, and species1, .., species4 are pixel images of the 
distributions of the 4 snakes. The result 'p' gives maximum likelihood estimates of the relative biting propensities for the 4 species (these are not normalised so only their relative values makes sense). 
By tweaking the call to 'eval.im' you can try different kinds of models.

There's an issue here about the 'opportunity' or 'exposure' to snake bite. If the density of people
(or visits by people) is not uniform across the study region then this could affect the observed
distribution of snake bites.

"Tomislav Hengl" <hengl at spatial-analyst.net> writes:

> Surprisingly, I could not find any function in the spatstat package (or splancs package) that
> specifically derives cross-correlations between multiple point processes.

The Mark Connection Function 'markconnect' does exactly this, e.g.
              data(lansing)
              plot(alltypes(lansing, markconnect))
(But this is for correlations between different points, not between points and spatial variables.)

> # fit stationary marked Poisson process with different intensity for each species:
> lansing.ppm <- ppm(lansing, ~marks, Poisson())
> summary(lansing.ppm)
> ...but this does not say anything about which species are most correlated 
> (and which are negatively correlated).

Of course not - the fitted model assumes they are NOT correlated. 

Adrian Baddeley