[R-sig-Geo] Implementing and interpreting join count analysis with spdep

Justin Schuetz jschuetz at Gmri.org
Sat Feb 11 17:47:07 CET 2017


Roger,


Thank you. This helps and gives me some more things to think about!


Cheers,


Justin

________________________________
From: Roger Bivand <Roger.Bivand at nhh.no>
Sent: Thursday, February 9, 2017 3:47:30 AM
To: Justin Schuetz
Cc: r-sig-geo at r-project.org
Subject: Re: [R-sig-Geo] Implementing and interpreting join count analysis with spdep

On Tue, 7 Feb 2017, Justin Schuetz wrote:

> Roger,
>
>
> Thanks for the leads. Is the general concern about including large
> numbers of neighbors that there may not really be any relationship
> between "very" distant points and that Type II errors will become more
> likely in a hypothesis testing framework?

No, two issues here.

One is that the data generation process involves not (I - \rho W) but
either (I - \rho W)^{-1} (CAR) or (I - \rho W)^{-1} (I - \rho W^T)^{-1}
(SAR), so it is dense anyway. Under certain assumptions, (I - \rho W)^{-1} is
\sum_{i=0}^{\infty} (\rho W)^i, that is, the sum of a power series in the
autoregressive coefficient \rho and the spatial weights. So if W contains only
contiguous neighbours, W %*% W gives neighbours of neighbours, and so on. So
including all neighbours in W, even with inverse distance weighting,
"double-guesses" the form of the relationship. Most of these measures came
into being anyway for areal support rather than point support - the name
"join-count" tells us this.
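
As a minimal sketch of the power-series point, here is a toy case in base R.
The 5-area chain, the row-standardisation and rho = 0.5 are assumptions made
up for illustration; nothing here uses the data from this thread.

```r
# Hypothetical toy example: 5 areas in a row, each adjacent only to the next.
n <- 5
A <- matrix(0, n, n)
for (i in 1:(n - 1)) {
  A[i, i + 1] <- 1
  A[i + 1, i] <- 1
}
W <- A / rowSums(A)   # row-standardised contiguity weights
rho <- 0.5            # assumed autoregressive coefficient, |rho| < 1

# Areas 1 and 3 are not first-order neighbours ...
W[1, 3]
# ... but W %*% W links neighbours of neighbours
(W %*% W)[1, 3]

# Dense inverse versus the truncated series sum_{i=0}^{200} (rho W)^i
inv <- solve(diag(n) - rho * W)
series <- diag(n)
term <- diag(n)
for (i in 1:200) {
  term <- term %*% (rho * W)
  series <- series + term
}

max(abs(inv - series))  # difference between inverse and truncated series
sum(W == 0)             # structural zeros in the sparse W
sum(inv == 0)           # zeros in the inverse
```

Even with only first-order contiguities in W, every area ends up related to
every other area through the inverse, which is why adding all pairs to W as
well specifies the relationship twice.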

The other is that the tests only make sense if we are confident that the
join counts are not being influenced by our partitioning of the surface
(or choice of points), nor by omitted variables. It could be that the join
counts are more related to elevation or temperature (say) than to being
neighbours - think of this as residuals from a regression (multinomial
here). Which join counts would we expect to see after taking into account
what we know about the drivers of the multinomial response?

Roger

>
>
> Justin
>
>
> ________________________________
> From: Roger Bivand <Roger.Bivand at nhh.no>
> Sent: Tuesday, February 7, 2017 12:19 PM
> To: Justin Schuetz
> Cc: r-sig-geo at r-project.org
> Subject: Re: [R-sig-Geo] Implementing and interpreting join count analysis with spdep
>
> On Tue, 7 Feb 2017, Justin Schuetz wrote:
>
>> List members,
>>
>> I am trying to assess whether points of the same color are spatially
>> autocorrelated. I have limited familiarity with spdep and join count
>> analyses and would like to confirm that I am implementing and
>> interpreting the analysis correctly. Below is a summary of the data,
>> code used in the analysis, results, and my quick interpretation. If
>> someone familiar with this type of analysis (or alternatives that are
>> better able to address the question) could offer feedback, I would
>> greatly appreciate it...particularly if you can help me understand how
>> best to choose the style parameter in nb2listw function.
>>
>> Many thanks,
>>
>> Justin
>>
>> ##### POINT DATA
>>
>> my.points
>> # A tibble: 50 × 4
>>    SVSPP JGS.UTM.X JGS.UTM.Y JGS.COLOR
>>    <int>     <dbl>     <dbl>     <int>
>>  1    13 -14272.27   4035288         3
>>  2    15 265382.54   4398047         3
>>  3    22 552678.02   4537646         3
>>  4    23 430904.66   4524648         3
>>  5    24 -43587.79   4050674         3
>>  6    25  67560.06   4190581         3
>>  7    26 326773.57   4488521         3
>>  8    27 578645.44   4735815         3
>>  9    28 533183.23   4722559         3
>> 10    33 368053.13   4566608         3
>> # ... with 40 more rows
>>
>> ##### CODE
>>
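>> library(sp)     # SpatialPoints, CRS
>> library(spdep)  # knearneigh, knn2nb, nbdists, nb2listw, joincount.mc
>>
>> # make SpatialPoints object and specify projection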
>>
>> coords <- my.points[, c("JGS.UTM.X", "JGS.UTM.Y")]
>> UTM19N <- "+proj=utm +zone=19 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs"
>> lat.lon <- SpatialPoints(coords, CRS(UTM19N))
>>
>> # generate spatial weights matrix (inverse distance weighting among all points)
>
> At this point you are probably deciding the outcome of the tests -
> different weights will give different outcomes. The fewer neighbours the
> better is a good rule, otherwise the relationships may be heavily
> smoothed. See vignette("nb").
>
>>
>> my.k.neighbors <- knearneigh(lat.lon, k = length(lat.lon) - 1, longlat = FALSE)
>> my.neighbors <- knn2nb(my.k.neighbors)
>> my.distances <- nbdists(my.neighbors, lat.lon)
>> my.weights <- lapply(my.distances, function(x) 1/(x))
>> my.list <- nb2listw(my.neighbors, glist = my.weights, style = "S")
>>
>> # identify colors
>>
>> my.colors <- as.factor(my.points$JGS.COLOR)
>>
>> # assess whether similar colors are closer to each other than expected
>> # by chance given fixed point locations
>
> It is also possible to use joincount.multi() to compare all with all (here
> 1-1, 1-2, 1-3, 2-2, 2-3, 3-3), rather than 1 with not-1 and so on. Also
> look at alternative=, as your 3-3 looks close to significant negative
> autocorrelation (for your preferred weights).
>
> Roger
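
A hedged sketch of these two suggestions, using simulated points rather than
the data above. The coordinates, the colours, k = 4 and the binary style are
all assumptions chosen for illustration, not recommendations for this data
set.

```r
library(spdep)

set.seed(1)
# Simulated stand-in for the posted data: 50 points, three colours.
# These coordinates and colours are invented, not taken from the thread.
xy <- cbind(runif(50, 0, 5e5), runif(50, 4.0e6, 4.7e6))
cols <- factor(sample(1:3, 50, replace = TRUE))

# A sparse neighbour set: 4 nearest neighbours, made symmetric
nb <- knn2nb(knearneigh(xy, k = 4), sym = TRUE)
lw <- nb2listw(nb, style = "B")   # binary weights

# All same-colour and cross-colour join counts in one table
jc <- joincount.multi(cols, lw)
print(jc)

# Permutation test for fewer same-colour joins than expected by chance
jc_less <- joincount.mc(cols, lw, nsim = 999, alternative = "less")
print(jc_less)
```

Re-running with a different k or a different style is a quick way to see how
far the conclusions depend on the choice of weights.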
>
>>
>> joincount.mc(my.colors, my.list, nsim = 1000)
>>
>> ##### RESULTS
>>
>>        Monte-Carlo simulation of join-count statistic
>>
>> data:  my.colors
>> weights: my.list
>> number of simulations + 1: 1001
>>
>> Join-count statistic for 1 = 0.2037, rank of observed statistic = 987,
>> p-value = 0.01399
>> alternative hypothesis: greater
>> sample estimates:
>>    mean of simulation variance of simulation
>>           0.061423025            0.001758675
>>
>>
>>        Monte-Carlo simulation of join-count statistic
>>
>> data:  my.colors
>> weights: my.list
>> number of simulations + 1: 1001
>>
>> Join-count statistic for 2 = 0.076401, rank of observed statistic = 754,
>> p-value = 0.2468
>> alternative hypothesis: greater
>> sample estimates:
>>    mean of simulation variance of simulation
>>           0.060196719            0.001639431
>>
>>
>>        Monte-Carlo simulation of join-count statistic
>>
>> data:  my.colors
>> weights: my.list
>> number of simulations + 1: 1001
>>
>> Join-count statistic for 3 = 19.002, rank of observed statistic = 36,
>> p-value = 0.964
>> alternative hypothesis: greater
>> sample estimates:
>>    mean of simulation variance of simulation
>>            19.3165692              0.0451797
>>
>> ##### INTERPRETATION
>>
>> Points with color "1" are closer to each other than expected by chance, whereas there is
>> little evidence that points with colors "2" (or "3") are spatially autocorrelated.
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
> --
> Roger Bivand
> Department of Economics, Norwegian School of Economics,
> Helleveien 30, N-5045 Bergen, Norway.
> voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
> http://orcid.org/0000-0003-2392-6140
> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
> http://depsy.org/person/444584
>

