[R-sig-Geo] inference of local Gi* using permutations

Fri Mar 8 22:41:29 CET 2019

On Thu, 7 Mar 2019, Jay Wang wrote:

> Thank you for the detailed explanation. Sorry for missing the explanation
> of res. It is an n by nsim+1 matrix where the n columns are the permutation
> results using sample() and the last column is the actual z values of G_i*s.
> The following line of code was modified to do the permutations: (i in
> 1:nsim) res[i] <- moran(sample(x), listw, n, S0, zero.policy)$I. This could
> be wrong since the sample() does not fix the value of interest at i. Are
> there any functions in R can do this in a correct way? How do we calculate
> the p-values after we get the n by nsim matrix? Thank you!

This is an exercise based on the built-in Getis-Ord dataset:

library(spdep)
data(getisord, package="spData")
xycoords <- cbind(xyz$x, xyz$y)
nb30 <- dnearneigh(xycoords, 0, 30)
lw <- nb2listw(nb30, style="B")
G30 <- localG(xyz$val, lw, return_internals=TRUE)
G_obs <- attr(G30, "internals")[,1]
val <- xyz$val
loc_G <- function(x, lw) {
   x_star <- sum(x)
   lx <- lag.listw(lw, x)
   G <- lx/(x_star-c(x))
   G
}
nsim <- 499L
set.seed(1)
res <- matrix(0, nrow=length(val), ncol=nsim)
thisx <- numeric(length(val))
idx <- 1:length(val)
for (j in 1:nsim) {
   for (i in seq(along=val)) {
     thisx[i] <- val[i]
     thisx[!idx %in% i] <- sample(val[-i])
     res[i, j] <- loc_G(thisx, lw)[i]
   }
}
mns <- apply(res, 1, mean)
sds <- apply(res, 1, sd)
G_Zi <- (G_obs - mns)/sds
plot(density(G30), ylim=c(0, 0.25))
lines(density(G_Zi), lty=2, col=2)

As you can see, the analytical z-values of the match the permutations of G 
values closely. We're doing a permutation bootstrap on the G values, not 
the z-values of G. Next, try pysal (here old pysal < 2):

library(reticulate)
pysal <- import("pysal")
np <- import("numpy")
write.nb.gal(nb30, "nb30.gal")
w <- pysal$open("nb30.gal")$read()
y <- np$array(val)
w$transformation = "B"
lg_30_499 <- pysal$esda$getisord$G_Local(y, w,
   transform="B", permutations=nsim)
lines(density(lg_30_499$z_sim), lty=3, col=3)
all.equal(G_obs, c(lg_30_499$Gs))
all.equal(c(G30), c(lg_30_499$Zs))

The G_Local() function provides both analytical and permutation z-values; 
the G values and the analytical z-values agree exactly with spdep, the 
permutations are not using the same random number generator or seed, but 
are broadly similar. If you read pysal/esda/getisord.py, you'll see that 
some indexing is being done, avoiding the double loop. The code is fairly 
hard to read, but in optimising the permutation may cut corners; the 
self.z_sim is the same code as the simpler R code above.

So while for the local Moran, permutation is demonstrably misleading in 
the presence of mis-specification, for local G permutation simply 
reproduces the analytical z-values. If you use these to look up normal 
p-values, then adjust them for the false discovery rate, usually nothing 
is left significant anyway.

Roger

>
>
> On Thu, Mar 7, 2019 at 8:05 AM Roger Bivand <Roger.Bivand using nhh.no> wrote:
>
>> On Wed, 6 Mar 2019, Jay Wang wrote:
>>
>>> Hello,
>>>
>>> I am currently using the localG () in spdep package, I was wondering if
>> we
>>> can have a conditional permutation-based inference to get the P value for
>>> every Gi*. I saw that a Mote Carlo simulation is used in Moran.MC(), and
>> I
>>> borrowed the following codes from this function and tried to see if I can
>>> do a permutation for localG():
>>>
>>> pvals<-matrix(0, nrow = V, ncol = 1)
>>> for (i in 1:V){
>>>  rankresi<-rank(res[i, ])
>>>  xranki <- rankresi[length(res[i, ])]
>>>  diffi <- nsim - xranki
>>>  diffi <- ifelse(diffi > 0, diffi, 0)
>>>  pvali <- punif((diffi + 1)/(nsim + 1))
>>>  pvals[i,]<-pvali
>>> }
>>
>> Monte Carlo or Hope type tests or permutation bootstraps work like
>> analytical permutation in the global case, and are redundant for that
>> reason. Monte Carlo (conditional permutation) in the local case requires
>> that there is no global autocorrelation or other mis-specifications.
>>
>> Your code snippet is not an example, and I think is not how permutation is
>> actually done. In the local case in ArcGIS, GeoDa, etc., you fix the value
>> of interest at i, randomly re-assign all the values to the other
>> observations, and recompute the local indicator at i, doing this say 499
>> times. Then you move on to the next i, and repeat. In your snippet, we
>> cannot see where res is coming from. Is this an n by nsim matrix from nsim
>> runs? Might you be needing to compare the sample values with the observed
>> value? If you are outputting Z-values of G_i anyway, you may need to step
>> back to the actual G_i (here from spdep):
>>
>> attr(localG(..., return_internals=TRUE), "internals")[,1]
>>
>> Then
>>
>> mns <- apply(res, 1, mean)
>> sds <- apply(res, 1, sd)
>> permZ <- (obs_localG - mns)/sds
>>
>> are z values that can be considered as drawn from the normal. However, I
>> advise against inference unless the assumptions are met, and p-values must
>> be adjusted anyway.
>>
>> Roger
>>
>>>
>>> After running these codes with several different datasets, I found that
>> all
>>> the negative Gi*s have very high P values say 0.999 with 999
>> permutations,
>>> meaning that there are no significant cold spots. Where is the problem?
>> How
>>> can we do conditional permutation-based inference for localG() with R
>>> spdep? I understand the critics of permutation-based inference for local
>>> indicators, but I just want to explore this. Thank you!
>>>
>>> Best
>>>
>>>       [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo using r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
>> https://orcid.org/0000-0003-2392-6140
>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en