[R-sig-Geo] Creating Spatial Weight Matrices with Large Data

Roger Bivand Roger.Bivand using nhh.no
Tue Dec 3 11:52:51 CET 2019


On Tue, 3 Dec 2019, Chanda Chiseni wrote:

> Hi Roger
>
> Thank you for your very helpful feedback. I was indeed treating my point
> data as polygons and did not impose a distance threshold. Essentially, as
> you stated, many observations had many neighbors. I have since tried to use
> K-nearest neighbors and imposed a restriction of k=4. However, this is
> still taking a bit long.
>
> #Increasing the memory capacity
> memory.limit(size = 80000)
> ## defining data
> censusdata= CensusFinal_Analysis_R1
>
> #Creating Matrix of Coordinates
> sp_point <- cbind(censusdata$X, censusdata$Y)
>
> colnames(sp_point)= c("Long","Lat")
> head(sp_point)
>
> ## Create the K nearest neighbour
> censusdata.4nn = knearneigh(sp_point,k=4,longlat = TRUE)

Don't use geographical coordinates. Project the points first; knearneigh()
then uses RANN for the K-nearest neighbour search, which is fast because it
works with Euclidean rather than Great Circle distances.
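
A minimal sketch of the project-first approach, not a definitive recipe. It
assumes the sf package for the projection step, and the simulated
coordinates, the object names (pts, pts_proj, knn4, nb4) and both EPSG codes
(4326 for the input, 32633 for a UTM zone) are illustrative assumptions. A
projected CRS suited to the actual study area should be substituted.

```r
library(sf)
library(spdep)

# Hypothetical stand-in for the census coordinates (longitude, latitude)
set.seed(1)
pts <- data.frame(X = runif(1000, 12, 18), Y = runif(1000, 55, 60))

# Declare the coordinates as geographical (WGS84 assumed), then project
# them to a planar CRS (UTM zone 33N here, purely as an example)
pts_sf <- st_as_sf(pts, coords = c("X", "Y"), crs = 4326)
pts_proj <- st_transform(pts_sf, 32633)

# Extract the projected coordinate matrix; longlat is left unset so that
# knearneigh() works with Euclidean distances
xy <- st_coordinates(pts_proj)
knn4 <- knearneigh(xy, k = 4)

# Convert to an nb object and, if needed, to row-standardised weights
nb4 <- knn2nb(knn4)
lw4 <- nb2listw(nb4, style = "W")
```

With the real data, the X and Y columns of the census data frame would take
the place of the simulated pts, and the rest of the steps stay the same.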

Roger

>
> I get stuck at the stage where I try to create the K nearest neighbours;
> the operation is quite slow. Am I still doing something wrong?
>
>
> Kind Regards,
>
> Michael Chanda Chiseni
>
> Phd Candidate
>
> Department of Economic History
>
> Lund University
>
> Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund
>
>
>
> *Africa is not poor, it is poorly managed (Ellen Johnson-Sirleaf ). *
>
>
>
>
>
>
> On Mon, Dec 2, 2019 at 1:00 PM Roger Bivand <Roger.Bivand using nhh.no> wrote:
>
>> On Mon, 2 Dec 2019, Chanda Chiseni wrote:
>>
>>> I am currently working with census data covering about 758 000
>>> individuals. I am trying to create a spatial weights matrix using the
>>> X-Y coordinates of their places of birth. However, I am running into
>>> problems when I try to create the nb-type weights matrix using poly2nb:
>>> R takes a very long time and eventually crashes. I have increased R's
>>> memory size to about 80000 but this is still not working.
>>
>> Please provide the (shortened) code used. poly2nb() is used for polygons,
>> not points. If you were using distances between points, you may have used
>> a distance threshold such that many observations have many neighbours.
>> Also ask yourself whether this is not a multi-level problem, in that
>> spatial interactions perhaps occur between aggregates of observations, not
>> the observations themselves.
>>
>>>
>>> Is there a way I can get around this problem? If anyone has any ideas on
>>> how I can create a spatial weights matrix for such a large data set,
>>> please help.
>>
>> An nb object (and listw) are just lists of length n, so a neighbour object
>> with 800K observations and 4 neighbours each only takes about 13MB, the
>> listw takes 38MB. What you can use them for may be another problem, and
>> much of the data may actually simply be noise not signal.
>>
>> Roger
>>
>>>
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo using r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
>> https://orcid.org/0000-0003-2392-6140
>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>
>


