[R-sig-Geo] Very dissimilar results with Moran's I and EBI

Mon Mar 23 10:29:59 CET 2015

On Mon, 23 Mar 2015, Richard Asturia wrote:

> Dear list,
>
> I have used a colleague's data to compare geographical clustering using
> different ways of detecting clustering. Today, I found very dissimilar
> results for the statistic of Moran's I and of the EBI version of Moran's I
> (using spdep's EBImoran.mc), what I understand was not supposed to be the
> case. In fact, with EBI it was twice as big (1.23 against 0.69).
>
> The issue is that the EBI statistic was supposed to be similar to Moran's I
> in what regards the value of Z, while not necessarily in what regards the
> p-value.
>
> I would like to ask your advice on what seems to be going wrong. In order
> to give a reproducible example without having to make you download my
> shapefile, I wrote a code that creates precisely the same neighborhood
> structure of the map I am using.

Thanks for trying to provide an example. Unfortunately, you posted HTML, 
which makes it very hard to copy and paste from any email client. The 
correct way to move an nb object would be:

nb <- structure(list(21L, c(8L, 9L, 11L, 18L, 20L, 22L, 25L), c(10L,
14L, 26L), c(8L, 25L), c(6L, 12L, 15L, 23L, 26L), c(5L, 12L,
13L, 18L, 27L), c(13L, 16L, 18L, 24L), c(2L, 4L, 20L, 25L), c(2L,
11L, 16L, 17L, 20L, 21L), c(3L, 14L, 28L), c(2L, 9L, 16L, 18L
), c(5L, 6L, 13L, 23L), c(6L, 7L, 12L, 18L, 19L, 23L, 24L, 27L
), c(3L, 10L, 26L, 28L), c(5L, 26L), c(7L, 9L, 11L, 18L, 21L),
     c(9L, 20L, 21L), c(2L, 6L, 7L, 11L, 13L, 16L, 22L, 27L),
     c(13L, 23L, 24L, 26L), c(2L, 8L, 9L, 17L, 21L), c(1L, 9L,
     16L, 17L, 20L), c(2L, 18L, 25L), c(5L, 12L, 13L, 19L, 26L
     ), c(7L, 13L, 19L), c(2L, 4L, 8L, 22L), c(3L, 5L, 14L, 15L,
     19L, 23L), c(6L, 13L, 18L), c(10L, 14L)), class = "nb", region.id = 
c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
"25", "26", "27", "28"), call = NA, sym = TRUE)

constructed by using dput() - so using spaces permitting line breaks, and 
for the others:

cases <- c(5, 55, 6, 14, 22, 2, 8, 159, 158, 0, 15, 85, 21, 8, 3, 27,
1128, 18, 16, 101, 72, 14, 1, 55, 18, 6, 0, 4)

pop <- c(865, 4534, 9708, 1799, 4929, 2154, 4159, 10414, 10917, 1420,
1166, 26098, 8417, 3049, 1853, 2419, 49108, 2982, 4218, 6824,
6923, 1961, 1259, 13661, 2418, 3672, 811, 1375)

and

my.listW <- nb2listw(nb, style="W")

Could you please say why you expect the values of I to be similar? Is 
this claimed in:

Assunção RM, Reis EA 1999 A new proposal to adjust Moran's I for
population density. Statistics in Medicine 18, pp. 2147-2162

Running

summary(glm(cases ~ 1 + offset(log(pop)), family=quasipoisson))

suggests that the data are overdispersed, so the assumptions for empirical 
Bayes adjustment may not be met. For Negative Binomial alternatives, see 
the DCluster package.

Hope this helps,

Roger

PS. It would be helpful to know why you are asking, and very helpful to 
have an affiliation in your signature. At least you use a real name!

>
> Any advice will be very much appreciated.
>
> Thanks in advance,
>
> RA
>
>
> #THE CODE:
> #obviously loading the package:
> require(spdep)
>
> #this big messy piece of code recreates in matrix form the neighborhood
> structure:
> neighb.structure
> <-matrix(c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.142857142857143,0.142857142857143,0,0.142857142857143,0,0,0,0,0,0,0.142857142857143,0,0.142857142857143,0,0.142857142857143,0,0,0.142857142857143,0,0,0,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0.333333333333333,0,0,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0,0,0,0.5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5,0,0,0,0,0,0,0,0,0.2,0,0,0,0,0,0.2,0,0,0.2,0,0,0,0,0,0,0,0.2,0,0,0.2,0,0,0,0,0,0,0.2,0,0,0,0,0,0,0.2,0.2,0,0,0,0,0.2,0,0,0,0,0,0,0,0,0.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0.25,0,0,0.25,0,0.25,0,0,0,0,0,0.25,0,0,0,0,0,0.25,0,0.25,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.25,0,0,0,0,0.25,0,0,0,0,0.166666666666667,0,0,0,0,0,0,0,0,0.166666666666667,0,0,0,0,0.166666666666667,0.166666666666667,0,0,0.166666666666667,0.166666666666667,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0.25,0,0,0,0,0,0,0.25,0,0,0,0,0,0,0.25,0,0.25,0,0,0,0,0,0,0,0,0,!
> 0,0,0,0,0,0.25,0.25,0,0,0,0,0,0,0.25,0,0,0,0,0,0,0,0,0,0.25,0,0,0,0,0,0,0,0,0,0,0.125,0.125,0,0,0,0,0.125,0,0,0,0,0,0.125,0.125,0,0,0,0.125,0.125,0,0,0.125,0,0,0,0.25,0,0,0,0,0,0,0.25,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.25,0,0.25,0,0,0,0,0.5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5,0,0,0,0,0,0,0,0,0.2,0,0.2,0,0.2,0,0,0,0,0,0,0.2,0,0,0.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0,0,0,0,0.333333333333333,0.333333333333333,0,0,0,0,0,0,0,0,0.125,0,0,0,0.125,0.125,0,0,0,0.125,0,0.125,0,0,0.125,0,0,0,0,0,0.125,0,0,0,0,0.125,0,0,0,0,0,0,0,0,0,0,0,0,0,0.25,0,0,0,0,0,0,0,0,0,0.25,0.25,0,0.25,0,0,0,0.2,0,0,0,0,0,0.2,0.2,0,0,0,0,0,0,0,0.2,0,0,0,0.2,0,0,0,0,0,0,0,0.2,0,0,0,0,0,0,0,0.2,0,0,0,0,0,0,0.2,0.2,0,0,0.2,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0,0.2,0,0,0,0,0,0,0.2,0.2,0,0,0,0,0,0.2,0,0,0,0,0,0,0.2,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0.3333!
> 33333333333,0,0,0,0,0,0,0,0,0,0,0.25,0,0.25,0,0,0,0.25,0,0,0,0,0,0,0,0
> ,0,0,0,0,0,0.25,0,0,0,0,0,0,0,0,0.166666666666667,0,0.166666666666667,0,0,0,0,0,0,0,0,0.166666666666667,0.166666666666667,0,0,0,0.166666666666667,0,0,0,0.166666666666667,0,0,0,0,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0.333333333333333,0,0,0,0,0.333333333333333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.5,0,0,0,0.5,0,0,0,0,0,0,0,0,0,0,0,0,0,0),28,28)
>
> #recreating the absolute value of the cases of interest in each feature of
> the map:
> cases <-
> c(5,55,6,14,22,2,8,159,158,0,15,85,21,8,3,27,1128,18,16,101,72,14,1,55,18,6,0,4)
>
> #recreating  the absolute value of the population in each feature of the
> map:
> pop <-
> c(865,4534,9708,1799,4929,2154,4159,10414,10917,1420,1166,26098,8417,3049,1853,2419,49108,2982,4218,6824,6923,1961,1259,13661,2418,3672,811,1375)
>
> #recreating the rate values in each feature of the map:
> rates <- cases/pop
>
> #generating a listW:
> neighb.structure <- t(neighb.structure)
> row.names(neighb.structure) <-
> my.listW <- mat2listw(neighb.structure)
>
> #running the tests:
> moran.test(x=rates, listw=my.listW)
> moran.mc(x=rates, listw=my.listW, nsim=999)
> EBImoran.mc(n=cases, x=pop, listw=my.listW, nsim=999)
>
> 	[[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no