[R-sig-Geo] Replicating spatstat::pcfcross
Adrian Baddeley
adrian.baddeley at uwa.edu.au
Tue Jan 28 05:39:21 CET 2014
On 28/01/14 10:39, Noam Ross wrote:
> I have an application for which I need very fast calculation of the pair
> correlation function between two point types in a marked point pattern.
> I'm attempting to replicate results from spatstat::pcfcross. I wrote a
> quick-and-dirty version of this function to ensure that I understood it
Oh, I *so* wanted to channel Brian Ripley there for a moment %^]
In the line
> cp$wt = A / (W - abs(cp$dx))*(H - abs(cp$dy))
you need to put parentheses around the denominator,
because a/b*c is parsed as (a/b)*c. Because of this line, the weights are incorrect;
they are sometimes less than 1 (instead of being always > 1) which is
causing most of the negative bias.
The sharp drop in the estimate at r ~~ rmax is caused by the line
cp <- closepairs(pp, rmax)
which restricts the calculation to pairs of points with distance <= rmax.
Contributions to the kernel estimate of pcf(rmax) can occur
from interpoint distances slightly greater than rmax. In the call to closepairs()
you need to replace 'rmax' by 'rmax + kwide' where 'kwide' is the width
of the support of the kernel.
There are several other idiosyncrasies (things that will only
work if the window is a rectangle, there are only two possible marks, etc)
but no other major problems.
Thank you for communicating this question. It has spurred me to get on with
implementing a faster version of pcfcross in 'spatstat'.
Adrian Baddeley
----
On 28/01/14 10:39, Noam Ross wrote:
> I have an application for which I need very fast calculation of the pair
> correlation function between two point types in a marked point pattern.
>
> I'm attempting to replicate results from spatstat::pcfcross. I wrote a
> quick-and-dirty version of this function to ensure that I understood it
> properly before moving everything to C and optimizing for my particular
> application. However, I'm not quite getting the same results as with
> pcfcross.
>
> I note that, while pcfcross calculates the pcf for each mark type, then
> calculates the cross-pcf from those, I am calculating the cross-pcf
> directly from the distances between heterogeneous pairs. This seems like a
> more efficient approach when I am only interested in the latter. The
> resulting estimated pcf consistently lower than that from pcfcross at long
> distances, and sometimes very different from pcfcross near r=0. What could
> account for this difference? Is it expected given the different methods?
>
> I've included the function below and an example plotting the spatstat and
> my own version against each other.
>
> #set.seed(0)
> library(spatstat)
> pp = rmpoispp(lambda=c(100,100), win=c(0,1,0,1), types=c("S", "I"))
>
> pcfcross2 <- function(pp, rmax = 0.25, stoyan = 0.15) {
>
> #Generate distances between close pairs, limit only to heterogenous pairs
> cp = closepairs(pp, rmax)
> dups = duplicated(cp$d)
> cp = lapply(cp, function(z) z[!dups])
> Infected = as.integer(marks(pp) == "I")
> SIpairs = which(Infected[cp$i] + Infected[cp$j] == 1)
> cp = lapply(cp, function(z) z[SIpairs])
>
> #ppp statistics
> win = as.owin(pp)
> H = diff(win$yrange)
> W = diff(win$xrange)
> A = H*W
> lambda = sqrt(prod(intensity(pp))/A)
>
> #calculate translational edge weights
> cp$wt = A / (W - abs(cp$dx))*(H - abs(cp$dy))
> wtot = sum(cp$wt)
>
> #Density parameters and calculation
> h <- stoyan/sqrt(pp$n/A)
> bw <- h/sqrt(5)
> dens = density.default(cp$d, weights=cp$wt/wtot, bw = bw,
> kernel="epanechnikov", from=0, to=rmax,
> n = 513)
>
> #Scale to g(r)
> r = dens$x
> y = dens$y*wtot / (2*pi*r*lambda^2*A)
> y[which(y==Inf)] = NA
> return(data.frame(r=r, y=y))
> }
>
>
> pcf1 = pcfcross(pp, "S", "I", correction="translate")
> plot(pcf1)
> pcf2 = pcfcross2(pp)
> lines(pcf2, col="blue")
More information about the R-sig-Geo
mailing list