[R-sig-Geo] Distance-based neighbourhood statistics problem
Roger Bivand
Roger.Bivand at nhh.no
Fri Sep 12 09:13:18 CEST 2008
On Thu, 11 Sep 2008, Gellrich Mario wrote:
> Hi again,
>
>
> no, the code line is not the problem - what's shown in the first message are just the first 10 lines of the data set. The problem seems to be the coordinates of the data. I have reduced the data set to only seven points to show the problem:
>
> # Example data set:
>
> d.geb <- read.table("geb_test.csv", header=T, sep=";")
> d.geb
>
> GEBID ANZAHLGEB ABWOW XACH YACH Ortsname
> 1 896967 1 1 682886 226329 gebaeude
> 2 915184 1 0 682949 226280 gebaeude
> 3 988432 1 6 682960 226315 gebaeude
> 4 1070819 1 1 682939 226282 gebaeude
> 5 1070925 1 3 682991 226290 gebaeude
> 6 1413991 1 0 682905 226297 gebaeude
> 7 -99 0 0 682934 226299 TestOrt
^^^^^^^^^^^^^
This looks like the problem - these are not decimal degrees, which is what
the longlat=TRUE used in dnearneigh() expects. With longlat=TRUE the
distance bounds are also taken as kilometres, not metres.
> xy <- cbind(c(682886, 682949, 682960, 682939, 682991, 682905, 682934),
c(226329, 226280, 226315, 226282, 226290, 226297, 226299))
> dnearneigh(xy, 0, 500)
Neighbour list object:
Number of regions: 7
Number of nonzero links: 42
Percentage nonzero weights: 85.71429
Average number of links: 6
> dnearneigh(xy, 0, 500, longlat=TRUE)
Neighbour list object:
Number of regions: 7
Number of nonzero links: 0
Percentage nonzero weights: 0
Average number of links: 0
7 regions with no links:
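
A quick range check makes the mismatch visible before any neighbour list is built. This is only a sketch in base R: looks_like_lonlat() is a hypothetical helper written for this illustration, not part of spdep, and a TRUE result would not prove that the values are geographical coordinates, only that they are not obviously out of range.

```r
# The seven coordinate pairs from the example above.
xy <- cbind(c(682886, 682949, 682960, 682939, 682991, 682905, 682934),
            c(226329, 226280, 226315, 226282, 226290, 226297, 226299))

# Hypothetical helper (not an spdep function): TRUE only if every x value
# lies in [-180, 180] and every y value lies in [-90, 90].
looks_like_lonlat <- function(coords) {
  all(coords[, 1] >= -180 & coords[, 1] <= 180) &&
    all(coords[, 2] >= -90 & coords[, 2] <= 90)
}

print(apply(xy, 2, range))     # minimum and maximum of each column
print(looks_like_lonlat(xy))   # FALSE: these cannot be decimal degrees
```
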
If the coordinates are decimal degrees that were multiplied by 1000 to
store them as integers, we get:
> dnearneigh(xy/1000, 0, 500, longlat=TRUE)
Neighbour list object:
Number of regions: 7
Number of nonzero links: 42
Percentage nonzero weights: 85.71429
Average number of links: 6
If you get that right first, the rest should follow.
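
As a minimal sketch of "the rest", assume instead that the coordinates are planar and measured in metres, as the Euclidean check in the quoted message suggests. The code below uses base R only, so it imitates the distance band rather than calling dnearneigh(); the names anzahlgeb, within, n_links and sumgeb are chosen for this illustration, and treating "within 500 m" as d <= 500 is an assumption of the sketch.

```r
# Coordinates and building counts (ANZAHLGEB) from the seven-point example.
xy <- cbind(c(682886, 682949, 682960, 682939, 682991, 682905, 682934),
            c(226329, 226280, 226315, 226282, 226290, 226297, 226299))
anzahlgeb <- c(1, 1, 1, 1, 1, 1, 0)

# Planar (Euclidean) distances in the units of the coordinates, assumed metres.
d <- as.matrix(dist(xy))

# Neighbours are the other points no more than 500 m away;
# a point is not counted as its own neighbour.
within <- d <= 500
diag(within) <- FALSE

# Number of neighbours per point, and the building count summed over them.
n_links <- rowSums(within)
sumgeb <- sapply(seq_len(nrow(xy)), function(i) sum(anzahlgeb[within[i, ]]))

print(n_links)   # 6 for every point, matching "Average number of links: 6"
print(sumgeb)    # 5 5 5 5 5 5 6
```

With spdep, the planar neighbour list is what dnearneigh(xy, 0, 500) returned above, so the lapply() step in the quoted code can be run on that object unchanged.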
Roger
>
>
> # Here ANZAHLGEB is the number of buildings per data point (= 1, except for the municipality); ABWOW is the number of flats in the building.
>
> library(spdep)
>
> geb.nb2 <- dnearneigh(as.matrix(d.geb[,4:5]), 0, 500, longlat=TRUE)
> d.geb$SUMGEB <- unlist(lapply(geb.nb2, function(x) ifelse(any(x==0), 0, sum(d.geb$ANZAHLGEB[x]))))
> d.geb
>
> # The outcome 'SUMGEB' is zero (= no buildings within a distance of 500 meters), even though every point has at least one neighbour within 500 m:
>
> GEBID ANZAHLGEB ABWOW XACH YACH Ortsname SUMGEB
> 1 896967 1 1 682886 226329 gebaeude 0
> 2 915184 1 0 682949 226280 gebaeude 0
> 3 988432 1 6 682960 226315 gebaeude 0
> 4 1070819 1 1 682939 226282 gebaeude 0
> 5 1070925 1 3 682991 226290 gebaeude 0
> 6 1413991 1 0 682905 226297 gebaeude 0
> 7 -99 0 0 682934 226299 TestOrt 0
>
>
> ### If I calculate the Euclidean distance between data points 2 and 4, I get the correct distance (in meters):
>
>
> d.dist <- sqrt( (682949 - 682939)^2 + (226280 - 226282)^2 )
> d.dist
>
> [1] 10.19804
>
> # It seems that dnearneigh uses another coordinate system (see also the X and Y variables in the columbus data set that comes with spdep).
>
>
> Best regards,
>
> Mario
>
>
> Hi Mario,
> I don't know if this is the problem, but the code line below needs to be changed:
>
> subset(d.geb, d.geb$MUNICIP == 1)
>
> best wishes,
>
> miltinho astronauta
> brazil
>
> On Thu, Sep 11, 2008 at 8:12 AM, Gellrich Mario
> <mario.gellrich at env.ethz.ch> wrote:
>
>> Hi,
>>
>> I've got a question regarding distance-based neighbourhood statistics using
>> two separate spatial data sets. What I have are municipalities and buildings
>> within municipalities which both come with x- and y-coordinates. I merged
>> the two datasets row-wise and used the following code to obtain aggregated
>> values for the number of buildings surrounding a municipality. The code, however,
>> doesn't work properly. Can anybody help?
>>
>>
>> d.geb <- read.table("gebaeuderecord_selection_test.csv", header=T, sep=";")
>> d.geb[1:10,]
>>
>> MUNICIP NAME GEBID NUMBERGEB ABPER LON LAT
>> 1 0 Gebaeude 4262 1 1 681864 225868
>> 2 0 Gebaeude 14168 1 7 682190 224294
>> 3 0 Gebaeude 33346 1 5 682176 224354
>> 4 0 Gebaeude 44610 1 15 681571 225189
>> 5 0 Gebaeude 62785 1 1 679800 226040
>> 6 0 Gebaeude 72287 1 1 684619 224239
>> 7 0 Gebaeude 83044 1 8 681701 224818
>> 8 0 Gebaeude 84827 1 2 684878 230975
>> 9 0 Gebaeude 86671 1 0 685733 227247
>> 10 0 Gebaeude 88022 1 1 685691 230548
>>
>>
>> # Variable description:
>>
>> # MUNICIP: zero for buildings; one for municipalities
>> # NAME : if 'Gebaeude' == building; otherwise municipality name
>> # GEBID: ID of building; -99 otherwise
>> # NUMBERGEB: number of buildings - one for each building-record; set to
>> zero if record represents municipality
>> # ABWOW: number of flats in a building
>>
>>
>> ### Creates a neighbourhood list and should provide aggregated values within
>> a distance of 1 kilometer from a municipality for the variables NUMBERGEB or
>> ABWOW
>>
>> library(spdep)
>> geb.nb2 <- dnearneigh(as.matrix(d.geb[,6:7]), 0, 1, longlat=TRUE)
>> d.geb$SUMGEB <- unlist(lapply(geb.nb2, function(x) ifelse(any(x==0), 0,
>> sum(d.geb$ABPER[x]))))
>>
>> subset(d.geb, MUNICIP == 1)
>>
>>
>> Best regards,
>>
>> Mario
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no