[R-sig-Geo] Creating Spatial Weight Matrices with Large Data
Roger Bivand
Roger.Bivand using nhh.no
Tue Dec 3 11:52:51 CET 2019
On Tue, 3 Dec 2019, Chanda Chiseni wrote:
> Hi Roger
> Thank you for your very helpful feedback. I was indeed treating my point
> data as polygons and did not impose a distance threshold. Essentially, as
> you stated, many observations had many neighbors. I have since tried to use
> k-nearest neighbors and imposed a restriction of k=4. However, this is still
> taking a long time.
> #Increasing the memory capacity
> memory.limit(size = 80000)
> ## defining data
> censusdata= CensusFinal_Analysis_R1
> #Creating Matrix of Coordinates
> sp_point <- cbind(censusdata$X, censusdata$Y)
> colnames(sp_point)= c("Long","Lat")
> head(sp_point)
> ## Create the K nearest neighbour
> censusdata.4nn = knearneigh(sp_point,k=4,longlat = TRUE)
Don't use geographical coordinates. Project first; K-nearest neighbours
then uses RANN, which is fast (Euclidean as against Great Circle
distances).
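
A minimal sketch of the projection step, assuming the sf and spdep packages
are installed. The synthetic coordinates and the target CRS (EPSG:32735, UTM
zone 35S) are placeholders chosen for illustration only and are not taken
from the census data; substitute a projected CRS appropriate to the study area.

```r
library(sf)
library(spdep)

# Synthetic stand-in for the census coordinates (hypothetical values)
set.seed(1)
pts <- data.frame(X = runif(1000, 27, 29), Y = runif(1000, -15, -13))

# Declare the coordinates as geographical (EPSG:4326), then project them.
# EPSG:32735 is an assumption for illustration, not the poster's CRS.
pts_ll <- st_as_sf(pts, coords = c("X", "Y"), crs = 4326)
pts_proj <- st_transform(pts_ll, 32735)

# Extract the projected coordinate matrix and find 4 nearest neighbours.
# longlat is left at its default, so Euclidean distances are used.
xy <- st_coordinates(pts_proj)
knn4 <- knearneigh(xy, k = 4)

# Convert the knn object to an nb neighbour list
nb4 <- knn2nb(knn4)
```

The resulting nb4 can then be passed to nb2listw() to build the weights.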
> I get stuck at the stage where I try to create the K nearest neighbors; the
> operation is quite slow. Am I still doing something wrong?
> Kind Regards,
> Michael Chanda Chiseni
> Phd Candidate
> Department of Economic History
> Lund University
> Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund
> *Africa is not poor, it is poorly managed (Ellen Johnson-Sirleaf ). *
> On Mon, Dec 2, 2019 at 1:00 PM Roger Bivand <Roger.Bivand using nhh.no> wrote:
>> On Mon, 2 Dec 2019, Chanda Chiseni wrote:
>>> I am currently working with census data covering about 758 000
>>> individuals. I am trying to create a spatial weights matrix using the X-Y
>>> coordinates of their places of birth. However, I am running into problems
>>> when I try to create the nb-type weights matrix using poly2nb: R takes
>>> a very long time and eventually crashes. I have increased R's memory
>>> limit to about 80000, but this is still not working.
>> Please provide the (shortened) code used. poly2nb() is used for polygons,
>> not points. If you were using distances between points, you may have used
>> a distance threshold such that many observations have many neighbours.
>> Also ask yourself whether this is not a multi-level problem, in that
>> spatial interactions perhaps occur between aggregates of observations, not
>> the observations themselves.
>>> Is there a way I can get around this problem? If anyone has any ideas on
>>> how I can create a spatial weights matrix for such a large data set, please
>>> help.
>> An nb object (and listw) are just lists of length n, so a neighbour object
>> with 800K observations and 4 neighbours each only takes about 13MB, the
>> listw takes 38MB. What you can use them for may be another problem, and
>> much of the data may actually simply be noise not signal.
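
As a rough check of the sizes quoted above, here is a sketch on synthetic
points, assuming the spdep package is installed. The coordinates and the
smaller n are hypothetical and chosen only to keep the run time short; the
sizes should grow roughly linearly with n.

```r
library(spdep)

# Synthetic projected coordinates (hypothetical, not the census data)
set.seed(1)
n <- 10000
xy <- cbind(runif(n, 0, 1e5), runif(n, 0, 1e5))

# k = 4 nearest neighbours, converted to nb and then to listw
nb4 <- knn2nb(knearneigh(xy, k = 4))
lw4 <- nb2listw(nb4, style = "W")

# nb4 is a list of length n; lw4$neighbours and lw4$weights
# are each lists of length n as well
length(nb4)
length(lw4$weights)

# Report the memory footprint of each object
print(object.size(nb4), units = "MB")
print(object.size(lw4), units = "MB")
```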
>> Roger
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo using r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
>> https://orcid.org/0000-0003-2392-6140
>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no