[R-sig-Geo] Create distance 'neighborhood' (zone of indifference) when clustering binary data

Wed Nov 23 18:33:41 CET 2011

On Wed, 23 Nov 2011, Alberto Gallano wrote:

> Hi Roger,
>
> thanks for your reply. The data I posted is only a subset of my real data,
> in which there are some points less than 100 meters apart.
>
> To clarify what I want:
>
> 1) I would like to create neighbourhoods of 100 meters within which all
> points are neighbors. I think my code will do that with the real dataset if
> I set the upper bounds of dnearneigh to 100.
>
> 2) I want points outside of this 100 meter neighborhood to still be
> neighbors, *but* with decreasing weight by distance. I believe this is
> referred to as inverse distance weighting. I may want to accentuate this by
> using squared inverse distance weighting.
>
> The question is how can I accomplish task 2? Here is my attempt:

Try converting both sets of weights to sparse matrices, adding them, and
converting the sum back to a listw object:

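library(spdep)
# simulated example: 500 random points in a 2000 m x 2000 m square
set.seed(1)
coords <- matrix(runif(1000, 0, 2000), ncol=2)
# zone of indifference: all points within 100 m are neighbours
nb1 <- dnearneigh(coords, 0, 100)
nb1
# largest first-nearest-neighbour distance, used as the outer bound
k1 <- knn2nb(knearneigh(coords, k=1))
maxD <- max(unlist(nbdists(k1, coords)))
# outer zone: neighbours between 100 m and maxD
nb2 <- dnearneigh(coords, 100, maxD)
# allow points with no neighbours (e.g. none within 100 m)
set.ZeroPolicyOption(TRUE)
# binary (0/1) weights inside the 100 m zone
lw1 <- nb2listw(nb1, style="B")
mat1 <- as(as_dgRMatrix_listw(lw1), "CsparseMatrix")
# inverse distance weights for the outer zone;
# to avoid an abrupt drop, I suggest subtracting 100m
dnb2 <- nbdists(nb2, coords)
idw2 <- lapply(dnb2, function(x) 1/(x-100))
# check that no weight is infinite (a pair exactly 100 m apart would give 1/0)
all(unlist(sapply(idw2, function(x) is.finite(x))))
lw2 <- nb2listw(nb2, glist=idw2, style="B")
mat2 <- as(as_dgRMatrix_listw(lw2), "CsparseMatrix")
# add the two weights matrices and inspect the result
mat12 <- mat1 + mat2
image(mat1)
image(mat2)
image(mat12)
summary(rowSums(mat12))
summary(colSums(mat12))
# convert the combined matrix back to a listw object
lw12 <- mat2listw(mat12, style="B")
lw12
table(card(lw12$neighbours))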
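
As a cross-check, here is a minimal base-R sketch of the same two-zone
weights computed from a dense distance matrix. It is not part of the spdep
workflow above: the function name two_zone_weights and its arguments cutoff
and max_dist are illustrative assumptions, and a dense matrix is only
practical for a modest number of points.

```r
# Base-R cross-check of the two-zone weights; no spdep needed.
# Assumption: two_zone_weights, cutoff and max_dist are illustrative names.
two_zone_weights <- function(coords, cutoff = 100, max_dist = Inf) {
  d <- as.matrix(dist(coords))          # full Euclidean distance matrix
  w <- matrix(0, nrow(d), ncol(d))
  inner <- d <= cutoff                  # zone of indifference: weight 1
  outer <- d > cutoff & d <= max_dist   # beyond it: inverse shifted distance
  w[inner] <- 1
  w[outer] <- 1 / (d[outer] - cutoff)
  diag(w) <- 0                          # a point is not its own neighbour
  w
}

set.seed(1)
coords <- matrix(runif(1000, 0, 2000), ncol = 2)
d <- as.matrix(dist(coords))
diag(d) <- Inf                          # ignore self-distances
max_dist <- max(apply(d, 1, min))       # largest nearest-neighbour distance
w <- two_zone_weights(coords, cutoff = 100, max_dist = max_dist)
summary(rowSums(w))
```

If the logic matches, rowSums(w) should agree with rowSums(mat12) above,
apart from how pairs at exactly 100 m are handled. Note that 1/(d - 100)
exceeds 1 for distances between 100 and 101 m, so the nearest outer-zone
neighbours can receive more weight than points inside the zone.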

Hope this helps,

Roger

>
> coords <- cbind(dat$x, dat$y)
> k1 <- knn2nb(knearneigh(coords))
> maxD <- max(unlist(nbdists(k1, coords)))
> datnb <- dnearneigh(coords, 0, maxD)
>
> # general weights - inverse distance squared
> dlist <- nbdists(datnb, coords)
> idlist <- lapply(dlist, function(x) (1/x)^2)
>
> datlistw.id2 <- nb2listw(datnb, glist=idlist, style="B", zero.policy=TRUE)
>
> joincount.test(as.factor(dat$present), datlistw.id2, zero.policy=TRUE,
>  alternative="greater", spChk=NULL, adjust.n=TRUE)
>
>
> Does this make sense? It seems so to me. (There is a warning given, but I
> think that is because my example has so few points).
>
> thank you,
>
> Alberto
>
>
>
> On Mon, Nov 21, 2011 at 3:29 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>
>> On Sun, 20 Nov 2011, Alberto Gallano wrote:
>>
>>> Hi, I'm trying to do join count analysis on binary data, but I'd like to
>>> create neighborhoods of 'indifference' of 100 meters around each spatial
>>> point.
>>>
>>> I've been using dnearneigh to create a neighbor list and knearneigh to
>>> calculate the maximum of the nearest neighbor distances for the upper
>>> bound argument to dnearneigh (so that no observations become islands).
>>>
>>> My question is, how do I create neighborhoods based on distance? Should I
>>> just input a number into the upper bound argument of dnearneigh, without
>>> first using knearneigh? If so, what number would correspond to 100 meters
>>> (Euclidean)? (btw, the coordinates are already projected).
>>>
>>
>> If you mean that you define neighbours as points j within 100m of point i,
>> then:
>>
>> datnb <- dnearneigh(coords, 0, 100)
>>
>> will do this if your coordinates are measured in metres. This doesn't work
>> for your example, because the closest points are almost 1200m apart, so no
>> point has any neighbours for this definition. However, your mentioning
>> neighborhoods of 'indifference' makes me uncertain that this is what you
>> mean. Do you mean placing a buffer around each point before finding
>> neighbours?
>>
>> Roger
>>
>>
>>> Here is a small subsample of my data and analysis. Thanks,
>>>
>>> Alberto
>>>
>>>
>>>
>>> # ==========================
>>> # data
>>> dat <- structure(list(present = c(0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L,
>>> 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L), x = c(332940L, 316301L,
>>> 312714L, 306008L, 312248L, 329276L, 329663L, 341535L, 314761L,
>>> 332898L, 332957L, 328332L, 312462L, 330063L, 317808L, 336216L,
>>> 333763L, 315049L, 333855L, 324406L), y = c(4305226L, 4303010L,
>>> 4316685L, 4309006L, 4319255L, 4311208L, 4316837L, 4306055L, 4301051L,
>>> 4300625L, 4330342L, 4303420L, 4308292L, 4307181L, 4292904L, 4304336L,
>>> 4313750L, 4297998L, 4314941L, 4315051L)), .Names = c("present",
>>> "x", "y"), class = "data.frame", row.names = c(1L, 2L, 3L, 4L, 5L, 6L,
>>> 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L))
>>>
>>>
>>> library(spdep)
>>> coords <- cbind(dat$x, dat$y)
>>> k1 <- knn2nb(knearneigh(coords))
>>> maxD <- max(unlist(nbdists(k1, coords))) # upper bound distance between neighbors
>>> datnb <- dnearneigh(coords, 0, maxD)
>>> summary(datnb)
>>> print(is.symmetric.nb(datnb))
>>>
>>> datlistw <- nb2listw(datnb, glist=NULL, style="B", zero.policy=TRUE)
>>>
>>> # join count
>>> joincount.test(as.factor(dat$present), datlistw, zero.policy=TRUE,
>>>   alternative="greater", spChk=NULL, adjust.n=TRUE)
>>> # =========================
>>>
>> --
>> Roger Bivand
>> Department of Economics, NHH Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; fax +47 55 95 95 43
>> e-mail: Roger.Bivand at nhh.no
>>
>>
>