[R-sig-Geo] Create distance 'neighborhood' (zone of indifference) when clustering binary data

Roger Bivand Roger.Bivand at nhh.no
Mon Nov 28 06:19:49 CET 2011


On Sun, 27 Nov 2011, Alberto Gallano wrote:

> Hi Roger,
>
> thanks so much for your suggestion. I am running the code on a large
> dataset and have been reading up on "sparse matrices", however, I haven't
> come across anything in the R or GIS literature that explains in simple
> terms how "going out to a sparse matrix with both, then back again" is
> achieving the goals I set.

Please do read the code. There is an aggregate method for nb objects in 
spdep, but no aggregate method for listw objects. Consequently, to 
aggregate them, I converted each to a sparse matrix format, added, and 
then converted back to listw format. My mistake was to think that trimming 
the >100m neighbours back to 0 would work - it didn't always do so. Again, 
read the code very carefully, this isn't about words, it's about the 
representation of objects in code.
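
A minimal base-R sketch of the same idea may make the representation easier to see. It is an illustration only, not the spdep code from the thread: the coordinates are invented, the 300m outer cut-off (maxD) is a hypothetical value, and dense matrices stand in for the sparse matrices and listw objects used in the quoted code.

```r
# Illustration only: base R, dense matrices, made-up coordinates.
set.seed(1)
coords <- matrix(runif(40, 0, 500), ncol = 2)  # 20 hypothetical points, in metres
d <- as.matrix(dist(coords))                   # pairwise Euclidean distances
core <- 100                                    # radius of the 0/1 "core" zone
maxD <- 300                                    # hypothetical outer cut-off

# Matrix 1: binary weights for pairs within the core distance
# (d > 0 excludes each point from being its own neighbour)
m1 <- (d > 0 & d <= core) * 1

# Matrix 2: inverse-distance weights for pairs beyond the core, with distance
# measured from the edge of the core and clamped to at least 1m so that no
# weight exceeds 1
m2 <- ifelse(d > core & d <= maxD, 1 / pmax(d - core, 1), 0)

# The two sets of pairs are disjoint, so adding the matrices combines them
m12 <- m1 + m2
summary(rowSums(m12))
```

Every pair within 100m gets weight 1, pairs between 100m and maxD get a weight that falls off with distance beyond the core, and all other pairs get 0. With spdep objects the same addition is done on sparse matrices, and mat2listw() turns the sum back into a listw object, as in the code quoted below.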

>
> Could you please point me to a reference that has employed a similar
> approach, or perhaps explain yourself how doing this creates a region
> outside of the 'core' neighborhood in which points gradually have less
> influence with distance? Thanks.
>

No, no idea, your problem.

Roger

> Alberto
>
>
> On Wed, Nov 23, 2011 at 12:55 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>
>> On Wed, 23 Nov 2011, Roger Bivand wrote:
>>
>>  On Wed, 23 Nov 2011, Alberto Gallano wrote:
>>>
>>>  Hi Roger,
>>>>
>>>> thanks for your reply. The data I posted is only a subset of my real
>>>> data, in which there are some points less than 100 meters apart.
>>>>
>>>> To clarify what I want:
>>>>
>>>> 1) I would like to create neighbourhoods of 100 meters within which all
>>>> points are neighbors. I think my code will do that with the real dataset
>>>> if I set the upper bound of dnearneigh to 100.
>>>>
>>>> 2) I want points outside of this 100 meter neighborhood to still be
>>>> neighbors, *but* with decreasing weight by distance. I believe this is
>>>> referred to as inverse distance weighting. I may want to accentuate this
>>>> by using squared inverse distance weighting.
>>>>
>>>> The question is how can I accomplish task 2? Here is my attempt:
>>>>
>>>
>>> Try going out to a sparse matrix with both, then back again:
>>>
>>> library(spdep)
>>> set.seed(1)
>>> coords <- matrix(runif(1000, 0, 2000), ncol=2)
>>> nb1 <- dnearneigh(coords, 0, 100)
>>> nb1
>>> k1 <- knn2nb(knearneigh(coords, k=1))
>>> maxD <- max(unlist(nbdists(k1, coords)))
>>> nb2 <- dnearneigh(coords, 100, maxD)
>>> set.ZeroPolicyOption(TRUE)
>>> lw1 <- nb2listw(nb1, style="B")
>>> mat1 <- as(as_dgRMatrix_listw(lw1), "CsparseMatrix")
>>> dnb2 <- nbdists(nb2, coords)
>>> idw2 <- lapply(dnb2, function(x) 1/(x-100))
>>> # to avoid an abrupt drop, I suggest subtracting 100m
>>>
>>
>> This wasn't a good idea, as points with (x-100) < 1 end up with weights >
>> 1. It would need trapping to reduce them to 1:
>>
>> idw2 <- lapply(dnb2, function(x) {x100 <- x-100; x100 <- ifelse (x100 < 1,
>> 1, x100); 1/x100})
>>
>> Roger
>>
>>
>>> all(unlist(sapply(idw2, function(x) is.finite(x))))
>>> lw2 <- nb2listw(nb2, glist=idw2, style="B")
>>> mat2 <- as(as_dgRMatrix_listw(lw2), "CsparseMatrix")
>>> mat12 <- mat1 + mat2
>>> image(mat1)
>>> image(mat2)
>>> image(mat12)
>>> summary(rowSums(mat12))
>>> summary(colSums(mat12))
>>> lw12 <- mat2listw(mat12, style="B")
>>> lw12
>>> table(card(lw12$neighbours))
>>>
>>> Hope this helps,
>>>
>>> Roger
>>>
>>>
>>>> coords <- cbind(dat$x, dat$y)
>>>> k1 <- knn2nb(knearneigh(coords))
>>>> maxD <- max(unlist(nbdists(k1, coords)))
>>>> datnb <- dnearneigh(coords, 0, maxD)
>>>>
>>>> # general weights - inverse distance squared
>>>> dlist <- nbdists(datnb, coords)
>>>> idlist <- lapply(dlist, function(x) (1/x)^2)
>>>>
>>>> datlistw.id2 <- nb2listw(datnb, glist=idlist, style="B",
>>>> zero.policy=TRUE)
>>>>
>>>> joincount.test(as.factor(dat$present), datlistw.id2, zero.policy=TRUE,
>>>>  alternative="greater", spChk=NULL, adjust.n=TRUE)
>>>>
>>>>
>>>> Does this make sense? It seems so to me. (There is a warning given, but I
>>>> think that is because my example has so few points).
>>>>
>>>> thank you,
>>>>
>>>> Alberto
>>>>
>>>>
>>>>
>>>> On Mon, Nov 21, 2011 at 3:29 AM, Roger Bivand <Roger.Bivand at nhh.no>
>>>> wrote:
>>>>
>>>>  On Sun, 20 Nov 2011, Alberto Gallano wrote:
>>>>>
>>>>>> Hi, I'm trying to do join count analysis on binary data, but I'd like to
>>>>>> create neighborhoods of 'indifference' of 100 meters around each spatial
>>>>>> point.
>>>>>>
>>>>>> I've been using dnearneigh to create a neighbor list and knearneigh to
>>>>>> calculate the maximum of the nearest neighbor distances for the upper
>>>>>> bound argument to dnearneigh (so that no observations become islands).
>>>>>>
>>>>>> My question is, how do I create neighborhoods based on distance? Should I
>>>>>> just input a number into the upper bound argument of dnearneigh, without
>>>>>> first using knearneigh? If so, what number would correspond to 100 meters
>>>>>> (Euclidean)? (btw, the coordinates are already projected).
>>>>>>
>>>>>>
>>>>> If you mean that you define neighbours as points j within 100m of point i,
>>>>> then:
>>>>>
>>>>> datnb <- dnearneigh(coords, 0, 100)
>>>>>
>>>>> will do this if your coordinates are measured in metres. This doesn't work
>>>>> for your example, because the closest points are almost 1200m apart, so no
>>>>> point has any neighbours for this definition. However, your mentioning
>>>>> neighborhoods of 'indifference' makes me uncertain that this is what you
>>>>> mean. Do you mean placing a buffer around each point before finding
>>>>> neighbours?
>>>>>
>>>>> Roger
>>>>>
>>>>>
>>>>>> Here is a small subsample of my data and analysis. Thanks,
>>>>>>
>>>>>> Alberto
>>>>>>
>>>>>>
>>>>>>
>>>>>> # ==========================
>>>>>> # data
>>>>>> dat <- structure(list(present = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L,
>>>>>> 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L), x = c(332940L, 316301L,
>>>>>> 312714L, 306008L, 312248L, 329276L, 329663L, 341535L, 314761L,
>>>>>> 332898L, 332957L, 328332L, 312462L, 330063L, 317808L, 336216L,
>>>>>> 333763L, 315049L, 333855L, 324406L), y = c(4305226L, 4303010L,
>>>>>> 4316685L, 4309006L, 4319255L, 4311208L, 4316837L, 4306055L, 4301051L,
>>>>>> 4300625L, 4330342L, 4303420L, 4308292L, 4307181L, 4292904L, 4304336L,
>>>>>> 4313750L, 4297998L, 4314941L, 4315051L)), .Names = c("present",
>>>>>> "x", "y"), class = "data.frame", row.names = c(1L, 2L, 3L, 4L, 5L, 6L,
>>>>>> 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L))
>>>>>>
>>>>>>
>>>>>> library(spdep)
>>>>>> coords <- cbind(dat$x, dat$y)
>>>>>> k1 <- knn2nb(knearneigh(coords))
>>>>>> maxD <- max(unlist(nbdists(k1, coords))) # upper bound distance btw neighbors
>>>>>> datnb <- dnearneigh(coords, 0, maxD)
>>>>>> summary(datnb)
>>>>>> print(is.symmetric.nb(datnb))
>>>>>>
>>>>>> datlistw <- nb2listw(datnb, glist=NULL, style="B", zero.policy=TRUE)
>>>>>>
>>>>>> # join count
>>>>>> joincount.test(as.factor(dat$present), datlistw, zero.policy=TRUE,
>>>>>>  alternative="greater", spChk=NULL, adjust.n=TRUE)
>>>>>> # =========================
>>>>>>
>>>>>>  --
>>>>> Roger Bivand
>>>>> Department of Economics, NHH Norwegian School of Economics,
>>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>>> voice: +47 55 95 93 55; fax +47 55 95 95 43
>>>>> e-mail: Roger.Bivand at nhh.no
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


