[R-sig-Geo] dnearneigh() from spdep: Points with the exact same location are not considered neighbours.

Maël Le Noc mael.lenoc at laposte.net
Wed Apr 12 21:15:59 CEST 2017


Thank you Roger,
Your patched version now works as expected!

And thank you for your suggestions, I am currently looking into all of that.
Best
Maël



On 12/04/2017 09:00, Roger Bivand wrote:
> On Wed, 12 Apr 2017, Maël Le Noc wrote:
>
>> Dear Roger,
>> Thank you for your answer, (And sorry for the HTML posting).
>>
>> The issue persists if I specify "GE" for the lower bound, but only when
>> the parameter longlat is set to TRUE (see example below).
>
> Thanks, very useful. The Great Circle distance measure returned
> NotANumber for zero distance, because of an unprotected division by
> zero. I've committed a patched source version to R-Forge. Look on
>
> https://r-forge.r-project.org/R/?group_id=182
>
> later today for a version with today's date and Rev: 693 - should show
> up mid to late evening CEST. Please say whether this performs as
> expected.
>
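
The failure mode described above (an unprotected division by zero giving NaN at zero distance) can be sketched in base R. This is only an illustration under stated assumptions: gc_dist_km() is a hypothetical helper, not spdep's code, and the Andoyer-Lambert style formula on the WGS84 ellipsoid is assumed here as one common great-circle approximation; the thread does not say which formula the compiled code uses. The helper takes scalar inputs only.

```r
# Hypothetical helper (not spdep's implementation): great-circle distance
# in km between two lon/lat points, Andoyer-Lambert style approximation.
gc_dist_km <- function(lon1, lat1, lon2, lat2, guard = TRUE) {
  # Guard: identical coordinates are exactly zero distance apart
  if (guard && lon1 == lon2 && lat1 == lat2) return(0)
  to_rad <- pi / 180
  a <- 6378.137            # WGS84 equatorial radius, km
  f <- 1 / 298.257223563   # WGS84 flattening
  Fm <- (lat1 + lat2) / 2 * to_rad
  G  <- (lat1 - lat2) / 2 * to_rad
  L  <- (lon1 - lon2) / 2 * to_rad
  S <- sin(G)^2 * cos(L)^2 + cos(Fm)^2 * sin(L)^2
  C <- cos(G)^2 * cos(L)^2 + sin(Fm)^2 * sin(L)^2
  w <- atan(sqrt(S / C))
  R <- sqrt(S * C) / w     # 0/0 gives NaN when the two points coincide
  D <- 2 * w * a
  H1 <- (3 * R - 1) / (2 * C)
  H2 <- (3 * R + 1) / (2 * S)
  D * (1 + f * H1 * sin(Fm)^2 * cos(G)^2 - f * H2 * cos(Fm)^2 * sin(G)^2)
}

# Points 1 and 2 from the example: identical coordinates
gc_dist_km(13.667029, 42.772396, 13.667029, 42.772396, guard = FALSE)  # NaN
gc_dist_km(13.667029, 42.772396, 13.667029, 42.772396)                 # 0
```

In compiled code a comparison against NaN is false, so a NaN distance would fail both the lower and the upper bound test, which is consistent with points 1 and 2 not being linked when longlat = TRUE.
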
>>
>>
>> Regarding the nature of my data, it is a series of records of Jews
>> arrested during the Holocaust in Italy. These are point data, and some
>> people were arrested at the same place and at the same time (hence
>> my problem). I am trying to assess spatial autocorrelation for a binary
>> attribute (whether they survived the Holocaust or not), and I plan to
>> use a Join-count method, for which I need a spatial weights matrix. Is
>> using Join-count on such a dataset wrong?
>
> Join-count should be OK, but if you have covariates you could try to
> remove the mean model first and only then see whether there is a
> spatially structured random effect, for example with hglm, R2BayesX,
> INLA, or similar. For hglm see for example:
>
> https://journal.r-project.org/archive/2015/RJ-2015-017/index.html
>
> The data you most likely do not have (addresses with residents at risk
> of arrest but not arrested) would also help, giving you a risk of
> arrest measure by address. There is also a spatial probit literature
> that might be relevant; if you have timestamps, you will likely find
> that operational factors play in, with arrests in a small area at the
> same time.
>
> Hope this helps,
>
> Roger
>
>>
>> Best
>>
>>
>>
>> Code:
>>
>> library(data.table)
>> library(spdep)
>> pointstable <- data.table(XCoord = c(13.667029, 13.667029, 13.667028),
>>                           YCoord = c(42.772396, 42.772396, 42.772396))
>> print(pointstable)
>> coords <- cbind(pointstable$XCoord, pointstable$YCoord)
>> # Great-circle distances (km): the duplicate points 1 and 2 are not linked
>> nbLocal <- dnearneigh(coords, d1 = 0, d2 = 25, longlat = TRUE,
>>                       bound = c("GE", "LE"))
>> summary(nbLocal)
>> # Euclidean distances: all three points are linked to each other
>> nbLocal <- dnearneigh(coords, d1 = 0, d2 = 25, longlat = FALSE,
>>                       bound = c("GE", "LE"))
>> summary(nbLocal)
>>
>>
>> Output:
>>> print(pointstable)
>>     XCoord  YCoord
>> 1: 13.66703 42.7724
>> 2: 13.66703 42.7724
>> 3: 13.66703 42.7724
>>
>>> nbLocal<- dnearneigh(coords, d1=0, d2=25, longlat = TRUE, bound =
>> c("GE", "LE"))
>>> summary(nbLocal)
>> Neighbour list object:
>> Number of regions: 3
>> Number of nonzero links: 4
>> Percentage nonzero weights: 44.44444
>> Average number of links: 1.333333
>> Link number distribution:
>>
>> 1 2
>> 2 1
>> 2 least connected regions:
>> 1 2 with 1 link
>> 1 most connected region:
>> 3 with 2 links
>>
>>> nbLocal<- dnearneigh(coords, d1=0, d2=25, longlat = FALSE, bound =
>> c("GE", "LE"))
>>> summary(nbLocal)
>> Neighbour list object:
>> Number of regions: 3
>> Number of nonzero links: 6
>> Percentage nonzero weights: 66.66667
>> Average number of links: 2
>> Link number distribution:
>>
>> 2
>> 3
>> 3 least connected regions:
>> 1 2 3 with 2 links
>> 3 most connected regions:
>> 1 2 3 with 2 links
>>
>>
>>
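
Note that print(pointstable) in the output above rounds to seven significant digits, so all three rows look the same even though only points 1 and 2 coincide exactly. A minimal base R sketch for checking which rows are true duplicates, using the same coordinates as the example (the variable name dup is just for illustration):

```r
# Same coordinates as in the example above
coords <- cbind(c(13.667029, 13.667029, 13.667028),
                c(42.772396, 42.772396, 42.772396))

# Row-wise check: TRUE marks a row identical to an earlier row
dup <- duplicated(coords)
dup   # FALSE TRUE FALSE

# Show more digits than the default print so the difference is visible
format(coords[, 1], digits = 9)
```
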
>> On 12/04/2017 02:27, Roger Bivand wrote:
>>> Do not post HTML-mail, only plain text. Your example is not
>>> reproducible because you used HTML-mail.
>>>
>>> Please read the help file, the bounds are described as being between
>>> lower (greater than) and upper (less than or equal to) bounds. Since
>>> the distance between identical points is strictly zero, they are not
>>> neighbours because the distance must be > d1 and <= d2. If d1 is < 0,
>>> it is reset to 0, as it is assumed that a negative lower bound is a
>>> user error (and it would break the underlying compiled code).
>>>
>>> In any case, no reasonable cross-sectional spatial process has
>>> duplicated point (nugget) observations in situations in which spatial
>>> weights would be used (spatio-temporal panels will have, but then time
>>> differs).
>>>
>>> Hope this clarifies,
>>>
>>> Roger
>>>
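
The bound rule described above (distance > d1 and <= d2, with a negative d1 reset to 0) can be sketched in base R. The helper in_band() and its lower argument are hypothetical names for illustration only; this is not how spdep implements the check, just the logic as stated in the thread, plus the "GE" lower bound mentioned in the follow-up.

```r
# Hypothetical helper illustrating the bound rule; not spdep's code.
in_band <- function(d, d1, d2, lower = c("GT", "GE")) {
  lower <- match.arg(lower)
  d1 <- max(d1, 0)  # a negative lower bound is treated as 0
  above <- if (lower == "GT") d > d1 else d >= d1
  above & d <= d2
}

in_band(0, 0, 25)          # FALSE: zero distance is not > 0
in_band(0, -1, 25)         # FALSE: d1 = -1 is reset to 0
in_band(0, 0, 25, "GE")    # TRUE: zero distance satisfies >= 0
```

Under the default rule, setting d1 to 0 or -1 therefore gives the same result for coincident points, which matches the identical output reported for both calls.
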
>>> On Wed, 12 Apr 2017, Maël Le Noc via R-sig-Geo wrote:
>>>
>>>> Dear List
>>>>
>>>> As I was working on a project, I realized that when I use dnearneigh
>>>> from spdep, two (or more) points that have the exact same coordinates
>>>> are not considered neighbours and thus are not linked (even when the
>>>> lower bound is set to 0 or even to -1). See below for an example.
>>>> (However, this does not happen if the parameter longlat is set to
>>>> FALSE.)
>>>>
>>>> Does the function behave the same way for you? Am I missing something?
>>>> Is this expected behavior? And if so, is there a way to change that?
>>>>
>>>> In the example below, points 1 and 2 are not connected to each other
>>>> (they are not neighbours, as you can see since they both have only one
>>>> link, to point 3), even though they have the exact same coordinates
>>>> (and are thus less than 25 km apart), while point 3 is connected to
>>>> both points 1 and 2. If I want to assess autocorrelation using, for
>>>> instance, joincount.test, this is then an issue...
>>>>
>>>>> library(data.table)
>>>>> library(spdep)
>>>>> pointstable <- data.table(XCoord=c(13.667029,13.667029,13.667028),
>>>>>                           YCoord=c(42.772396,42.772396,42.772396))
>>>>> print(pointstable)
>>>>       XCoord    YCoord
>>>> 1: 13.667029 42.772396
>>>> 2: 13.667029 42.772396
>>>> 3: 13.667028 42.772396
>>>>> coords <- cbind(pointstable$XCoord, pointstable$YCoord)
>>>>> nbLocal <- dnearneigh(coords, d1=0, d2=25, longlat = TRUE)
>>>>> nbLocal <- dnearneigh(coords, d1=-1, d2=25, longlat = TRUE) # both lines produce the same output
>>>>> summary(nbLocal)
>>>> Neighbour list object:
>>>> Number of regions: 3
>>>> Number of nonzero links: 4
>>>> Percentage nonzero weights: 44.44444
>>>> Average number of links: 1.333333
>>>> Link number distribution:
>>>>
>>>> 1 2
>>>> 2 1
>>>> 2 least connected regions:
>>>> 1 2 with 1 link
>>>> 1 most connected region:
>>>> 3 with 2 links
>>>>
>>>> Thanks
>>>> Maël
>>>>
>>>>
>>>>     [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>
>>
>


