[R-sig-Geo] Comparing distance among point pattern events

Thu Jan 9 15:35:19 CET 2020

Dear R-Sig-Geo Members,

I have three hypothetical point patterns (A, B and C), and my question is: which point pattern (B or C) is closer to A?

For this problem, I made a simple example:

library(spatstat)
set.seed(2023)
A <- rpoispp(30) ## First event
B <- rpoispp(30) ## Second event
C <- rThomas(10,0.02,5) ## Third event with Thomas cluster process
plot(A, pch=16)
plot(B, col="red", add=TRUE)
plot(C, col="blue", add=TRUE)

First, I take the distances between all pairs of events:

ABd<-crossdist(A, B)
ACd<-crossdist(A, C)

mean(ABd)
# 0.4846027
mean(ACd)
# 0.5848766

# test the hypothesis that mean(ABd) is equal to mean(ACd) (code courtesy of Sarah Goslee)

nperm <- 999

permout <- data.frame(ABd = rep(NA, nperm), ACd = rep(NA, nperm))

# create framework for a random assignment of B and C to the existing points

BC <- superimpose(B, C)
B.len <- npoints(B)
C.len <- npoints(C)
B.sampvect <- c(rep(TRUE, B.len), rep(FALSE, C.len))

set.seed(2023)
for(i in seq_len(nperm)) {
     B.sampvect <- sample(B.sampvect)
     B.perm <- BC[B.sampvect]
     C.perm <- BC[!B.sampvect]

     permout[i, ] <- c(mean(crossdist(A, B.perm)), mean(crossdist(A, C.perm)))
}

boxplot(permout$ABd - permout$ACd)
points(1, mean(ABd) - mean(ACd), col="red")

table(abs(mean(ABd) - mean(ACd)) >= abs(permout$ABd - permout$ACd))
#TRUE
# 999

sum(abs(mean(ABd) - mean(ACd)) >= abs(permout$ABd - permout$ACd)) / nperm
# [1] 1

The difference between ABd and ACd is distinguishable from that obtained by a random reassignment of B and C: the observed absolute difference is at least as large as all 999 permuted differences.
So B (mean distance 0.4846027) is closer to A than C (0.5848766).
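
As a cross-check, the proportion computed above is the share of permuted differences that the observed difference equals or exceeds, so a value near 1 corresponds to a small p-value. A minimal sketch of the more conventional two-sided permutation p-value follows, assuming the observed labelling is counted as one of the permutations. The helper name perm_pvalue() is hypothetical and not part of spatstat.

```r
# perm_pvalue() is a hypothetical helper, not a spatstat function.
# obs:  observed difference, e.g. mean(ABd) - mean(ACd)
# perm: vector of permuted differences, e.g. permout$ABd - permout$ACd
perm_pvalue <- function(obs, perm) {
  # number of permuted differences at least as extreme as the observed one
  n_extreme <- sum(abs(perm) >= abs(obs))
  # add 1 to numerator and denominator so the observed labelling counts too
  (1 + n_extreme) / (1 + length(perm))
}

# toy numbers, made up for illustration only
p_toy <- perm_pvalue(0.5, c(0.1, -0.2, 0.3))
```

With the objects from the example it would be called as perm_pvalue(mean(ABd) - mean(ACd), permout$ABd - permout$ACd).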

But now I compare the mean nearest-neighbour distance, i.e. the minimum distance from each point of A to the nearest point of the other type:

marks(A)<-as.factor("A")
marks(B)<-as.factor("B")
marks(C)<-as.factor("C")

# distance from each point of A to its nearest neighbour in B
nnda <- nncross(A, B)

# mean nearest neighbour distances
mean(nnda[,1])
#[1] 0.09847543

# distance from each point of A to its nearest neighbour in C
nndb <- nncross(A, C)

# mean nearest neighbour distances
mean(nndb[,1])
#[1] 0.151127
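
For reference, the two summaries measure different things: the mean of crossdist() averages every A-to-B distance, while the mean of the nncross() distances averages only the shortest distance from each point of A. A small base-R illustration with made-up coordinates is given here. The helper cross_dist() is hypothetical and only mimics the distance matrix; it is not spatstat's implementation.

```r
# Hypothetical helper: matrix of Euclidean distances between two point sets,
# one row per point of the first set, one column per point of the second.
cross_dist <- function(x1, y1, x2, y2) {
  sqrt(outer(x1, x2, "-")^2 + outer(y1, y2, "-")^2)
}

# made-up coordinates: two "A" points and three "B" points
a_x <- c(0, 1); a_y <- c(0, 0)
b_x <- c(0, 1, 5); b_y <- c(1, 1, 0)

d <- cross_dist(a_x, a_y, b_x, b_y)

mean_all <- mean(d)                # all-pairs summary (like mean(crossdist()))
mean_nn  <- mean(apply(d, 1, min)) # nearest-neighbour summary (like nncross())
```

Here the far-away third B point inflates the all-pairs mean but leaves the nearest-neighbour mean unchanged, so the first summary reflects the overall spread of the patterns and the second reflects local proximity.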

# test the same hypothesis again, now with the mean nearest-neighbour distances
# (the permout columns keep the names ABd and ACd)

nperm <- 999

permout <- data.frame(ABd = rep(NA, nperm), ACd = rep(NA, nperm))

# create framework for a random assignment of B and C to the existing points

BC <- superimpose(B, C)
B.len <- npoints(B)
C.len <- npoints(C)
B.sampvect <- c(rep(TRUE, B.len), rep(FALSE, C.len))

set.seed(2023)
for(i in seq_len(nperm)) {
     B.sampvect <- sample(B.sampvect)
     B.perm <- BC[B.sampvect]
     C.perm <- BC[!B.sampvect]
     ab<-nncross(A, B.perm)
     ac<-nncross(A, C.perm)

     permout[i, ] <- c(mean(ab[,1]), mean(ac[,1]))
}

boxplot(permout$ABd - permout$ACd)
points(1, mean(nnda[,1]) - mean(nndb[,1]), col="red")

table(abs(mean(nnda[,1]) - mean(nndb[,1])) >= abs(permout$ABd - permout$ACd))
#FALSE  TRUE
#   91   908

sum(abs(mean(nnda[,1]) - mean(nndb[,1])) >= abs(permout$ABd - permout$ACd)) / nperm
#[1] 0.9089089
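
The two permutation runs differ only in the summary statistic, so the loop can be written once and the statistic passed in as a function. What follows is a minimal base-R sketch of that idea, assuming point sets stored as two-column coordinate matrices. All names (cross_dist, mean_all, mean_nn, perm_diff) and the simulated coordinates are hypothetical, and none of this is spatstat code.

```r
# Hypothetical helpers; point sets are two-column matrices of (x, y).
cross_dist <- function(P, Q) {
  sqrt(outer(P[, 1], Q[, 1], "-")^2 + outer(P[, 2], Q[, 2], "-")^2)
}
mean_all <- function(P, Q) mean(cross_dist(P, Q))               # all pairs
mean_nn  <- function(P, Q) mean(apply(cross_dist(P, Q), 1, min)) # nearest only

# Randomly reassign the B/C labels and recompute stat(A, B) - stat(A, C).
perm_diff <- function(A, B, C, stat, nperm = 999) {
  BC <- rbind(B, C)
  is_B <- rep(c(TRUE, FALSE), times = c(nrow(B), nrow(C)))
  perm <- numeric(nperm)
  for (i in seq_len(nperm)) {
    lab <- sample(is_B)
    perm[i] <- stat(A, BC[lab, , drop = FALSE]) -
      stat(A, BC[!lab, , drop = FALSE])
  }
  list(observed = stat(A, B) - stat(A, C), permuted = perm)
}

# made-up data: A and B uniform on the unit square, C concentrated in the middle
set.seed(2023)
A <- cbind(runif(30), runif(30))
B <- cbind(runif(30), runif(30))
C <- cbind(runif(10, 0.4, 0.6), runif(10, 0.4, 0.6))

res_all <- perm_diff(A, B, C, mean_all, nperm = 99)
res_nn  <- perm_diff(A, B, C, mean_nn, nperm = 99)
```

Running both statistics through the same function makes it easier to see whether they disagree because of the data or because of a difference in the code.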

Now the conclusion is the same, since the mean nearest-neighbour distance from A to B (0.09847543) is smaller than from A to C (0.151127),
but it is not so clear to me which is the better approach for answering my question: comparing the crossdist() results or the nncross() results?

Any conceptual tips?

Thanks in advance,

-- 
Alexandre dos Santos
Geotechnologies and Spatial Statistics applied to Forest Entomology
Instituto Federal de Mato Grosso (IFMT) - Campus Caceres
Caixa Postal 244 (PO Box)
Avenida dos Ramires, s/n - Distrito Industrial
Caceres - MT - CEP 78.200-000 (ZIP code)
Phone: (+55) 65 99686-6970 / (+55) 65 3221-2674
Lattes CV: http://lattes.cnpq.br/1360403201088680
OrcID: orcid.org/0000-0001-8232-6722
ResearchGate: www.researchgate.net/profile/Alexandre_Santos10
Publons: https://publons.com/researcher/3085587/alexandre-dos-santos/
--

On 22/11/2019 10:09, Sarah Goslee wrote:
> Hi,
>
> Great question, and clear example.
>
> The first problem:
> ACd<-pairdist(A) instead of ACd <- pairdist(AC)
>
> BUT
>
> pairdist() is the wrong function: that calculates the distances
> between ALL pairs of points, A to A and C to C as well as A to C.
>
> You need crossdist() instead.
>
> The most flexible approach is to roll your own permutation test. That
> will work even if B and C are different sizes, etc. If you specify the
> problem more exactly, there are probably parametric tests, but I like
> permutation tests.
>
>
> library(spatstat)
> set.seed(2019)
> A <- rpoispp(100) ## First event
> B <- rpoispp(50) ## Second event
> C <- rpoispp(50) ## Third event
> plot(A, pch=16)
> plot(B, col="red", add=T)
> plot(C, col="blue", add=T)
>
> ABd<-crossdist(A, B)
> ACd<-crossdist(A, C)
>
> mean(ABd)
> # 0.5168865
> mean(ACd)
> # 0.5070118
>
>
> # test the hypothesis that ABd is equal to ACd
>
> nperm <- 999
>
> permout <- data.frame(ABd = rep(NA, nperm), ACd = rep(NA, nperm))
>
> # create framework for a random assignment of B and C to the existing points
>
> BC <- superimpose(B, C)
> B.len <- npoints(B)
> C.len <- npoints(C)
> B.sampvect <- c(rep(TRUE, B.len), rep(FALSE, C.len))
>
> set.seed(2019)
> for(i in seq_len(nperm)) {
>      B.sampvect <- sample(B.sampvect)
>      B.perm <- BC[B.sampvect]
>      C.perm <- BC[!B.sampvect]
>
>      permout[i, ] <- c(mean(crossdist(A, B.perm)), mean(crossdist(A, C.perm)))
> }
>
>
> boxplot(permout$ABd - permout$ACd)
> points(1, mean(ABd) - mean(ACd), col="red")
>
> table(abs(mean(ABd) - mean(ACd)) >= abs(permout$ABd - permout$ACd))
> # FALSE  TRUE
> #  573   426
>
> sum(abs(mean(ABd) - mean(ACd)) >= abs(permout$ABd - permout$ACd)) / nperm
> # 0.4264264
>
> The difference between ACd and ABd is indistinguishable from that
> obtained by a random resampling of B and C.
>
>
> Sarah
>
> On Fri, Nov 22, 2019 at 8:26 AM ASANTOS via R-sig-Geo
> <r-sig-geo using r-project.org> wrote:
>> Dear R-Sig-Geo Members,
>>
>> I have the hypothetical point process situation:
>>
>> library(spatstat)
>> set.seed(2019)
>> A <- rpoispp(100) ## First event
>> B <- rpoispp(50) ## Second event
>> C <- rpoispp(50) ## Third event
>> plot(A, pch=16)
>> plot(B, col="red", add=T)
>> plot(C, col="blue", add=T)
>>
>> I'd like to know an adequate spatial approach for comparing whether, on
>> average, event B or C is closer to A. For this, I tried:
>>
>> AB<-superimpose(A,B)
>> ABd<-pairdist(AB)
>> AC<-superimpose(A,C)
>> ACd<-pairdist(A)
>> mean(ABd)
>> #[1] 0.5112954
>> mean(ACd)
>> #[1] 0.5035042
>>
>> With this naive approach, I concluded that event C is closer to A
>> than B. Is this enough for a final conclusion, or is a more robust
>> analysis possible?
>>
>> Thanks in advance,
>>
>> Alexandre
>>
