[R-sig-Geo] gDistance on a LARGE Spatial Object?

Wouter Steenbeek WSteenbeek at nscr.nl
Fri Oct 26 23:13:03 CEST 2012


Disregarding solution 1 (;-)), let me give you a practical example where I (think I) really need the full matrix. More specifically, it is an example in which I use the function gWithinDistance, but the same memory issue applies.

I have a points.sp object containing my units of analysis (e.g., 20000 centroids of streets). I also have a buildings.sp object containing the XY coordinates of all 30000 buildings in that area. I want to calculate for each street centroid how many buildings are located within a radius of 50 meters of that centroid.

So now I do:

newbuilding <- rowSums(gWithinDistance(buildings.sp, points.sp, dist = 50, byid = TRUE, hausdorff = FALSE, densifyFrac = NULL))

However, this of course suffers from the same memory problem. I've tried your solution 2 ("think: what do you want to do with all these distances, can you get there without having all distances in a single matrix at once?"), but I can't think of anything other than brute-force chunking (sketched below). Any tips?
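
A rough sketch of the chunked version (untested on the full data, and assuming points.sp and buildings.sp are projected SpatialPoints with coordinates in metres):

library(sp)
library(rgeos)

chunk_size <- 1000                       # centroids per block
n <- nrow(coordinates(points.sp))
newbuilding <- numeric(n)

for (start in seq(1, n, by = chunk_size)) {
  idx <- start:min(start + chunk_size - 1, n)
  within <- gWithinDistance(buildings.sp, points.sp[idx, ], dist = 50, byid = TRUE)
  # one margin of 'within' indexes the centroid chunk, the other the buildings;
  # sum over the buildings margin to get one count per centroid in this chunk
  newbuilding[idx] <- if (nrow(within) == length(idx)) rowSums(within) else colSums(within)
}

Each iteration then only holds a chunk_size x 30000 logical matrix (on the order of 100 Mb) rather than the full 20000 x 30000 one.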

Cheers, Wouter


________________________________________
From: r-sig-geo-bounces at r-project.org [r-sig-geo-bounces at r-project.org] On Behalf Of Edzer Pebesma [edzer.pebesma at uni-muenster.de]
Sent: Friday, October 26, 2012 9:25 PM
To: r-sig-geo at r-project.org
Subject: Re: [R-sig-Geo] gDistance on a LARGE Spatial Object?

On 10/26/2012 08:46 PM, Wouter Steenbeek wrote:
> I am trying to use the "gDistance" function from the "rgeos" package to calculate the Euclidean distances between a large number of points: 28000 points, to be exact.
>
> I'm doing:
> distance <- gDistance(points.sp, byid = TRUE)
>
> R can't run this on my 32-bit 3 Gb RAM machine. The same function works on a subset of these points (say the first 400 points, which I call "points1_400.sp"). I think this is because gDistance keeps a matrix of 28000 * 28000 cells in RAM. At eight bytes per cell, this amounts to 28000 * 28000 * 8 bytes / (1024^3 bytes per Gb) = almost 6 Gb of RAM.
>
> 1. Is my interpretation of the problem correct?

Yes.
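
Quick check in R:

28000 * 28000 * 8 / 1024^3   # about 5.84 Gb for a dense matrix of doubles
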
>
> 2. What would be the best way to tackle this problem? (Chop up points.sp into chunks, calculate distances per chunk, and then merge the matrices together? I'm not sure how to do that in R. Code examples of how to do this are appreciated.)

Chopping and merging will not make the resulting object any smaller, so it will still be difficult on your 3 Gb machine. As usual, there are two solutions:

1. don't think: buy a computer with (much) more RAM;

2. think: what do you want to do with all these distances? Can you get there without having all of them in a single matrix at once?
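
For example, if all you eventually need is each point's distance to its nearest neighbour, you can get that block by block without ever holding the full 28000 x 28000 matrix. A sketch using sp::spDists (plain Euclidean distances, so equivalent to gDistance for projected points; the block size is arbitrary):

library(sp)

block <- 500
xy <- coordinates(points.sp)
n <- nrow(xy)
nndist <- numeric(n)

for (start in seq(1, n, by = block)) {
  idx <- start:min(start + block - 1, n)
  d <- spDists(xy[idx, , drop = FALSE], xy)   # length(idx) x n matrix of distances
  d[cbind(seq_along(idx), idx)] <- Inf        # ignore each point's zero self-distance
  nndist[idx] <- apply(d, 1, min)
}

At most a 500 x 28000 matrix of doubles (about 110 Mb) is in memory at any time.
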
--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics      e.pebesma at wwu.de

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


