[R-sig-Geo] gDistance on a LARGE Spatial Object?

Edzer Pebesma edzer.pebesma at uni-muenster.de
Fri Oct 26 21:25:09 CEST 2012



On 10/26/2012 08:46 PM, Wouter Steenbeek wrote:
> I am trying to use the gDistance function from the rgeos package to calculate the Euclidean distances between a large number of points: 28,000 points, to be exact.
> 
> I'm doing:
> distance <- gDistance(points.sp, byid = TRUE)
> 
> R cannot run this on my 32-bit, 3 GB RAM machine. The same call works on a subset of these points (say, the first 400 points, which I call "points1_400.sp"). I think the reason is that gDistance has to hold a 28000 x 28000 matrix in RAM. At 8 bytes per double-precision cell, that amounts to 28000 * 28000 * 8 bytes / 1024^3, i.e. almost 6 GB of RAM.
> 
> 1. Is my interpretation of the problem correct?

Yes.
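For reference, the arithmetic in R itself (assuming 8-byte doubles, which
is what a numeric matrix stores):

  28000^2 * 8 / 1024^3   # ~5.84, i.e. almost 6 GB for the matrix alone

And on 32-bit R the process address space is capped at roughly 2-3 GB, so
an allocation of that size fails regardless of how much RAM is installed.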
> 
> 2. What would be the best way to tackle this problem? (Chop points.sp up into chunks, calculate distances per chunk, and then merge the matrices together? I'm not sure how I can do that in R. Code examples of how to do this are appreciated.)

Chunking will not make the final object any smaller, so it will still be
difficult on your 3 GB machine. As usual, there are two solutions:

1. don't think: buy a computer with (much) more RAM;

2. think: what do you want to do with all these distances? Can you get
there without having all of them in a single matrix at once?
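
To make option 2 concrete: for point geometries gDistance is plain
Euclidean distance, so you can work from the coordinates directly and walk
through the distance matrix in row blocks, keeping only the summary you
actually need. A minimal sketch, assuming the goal is (for example) each
point's nearest-neighbour distance; points.sp, the chunk size, and the toy
data are placeholders for your own:

library(sp)

## toy stand-in for your data; the same code scales to 28000 points
set.seed(1)
points.sp <- SpatialPoints(cbind(runif(5000), runif(5000)))

xy <- coordinates(points.sp)
n  <- nrow(xy)
chunk.size <- 500             # tune this to the RAM you have
nearest <- numeric(n)         # example summary: nearest-neighbour distance

for (s in seq(1, n, by = chunk.size)) {
  idx <- s:min(s + chunk.size - 1, n)
  ## Euclidean distances from this block of points to all points:
  ## a length(idx) x n slab, never the full n x n matrix
  d <- sqrt(outer(xy[idx, 1], xy[, 1], "-")^2 +
            outer(xy[idx, 2], xy[, 2], "-")^2)
  d[cbind(seq_along(idx), idx)] <- Inf   # mask each point's self-distance
  nearest[idx] <- apply(d, 1, min)
}

If you really do need every pairwise distance kept, one option is to
append each slab to a file (write.table(..., append = TRUE)) or to use a
file-backed matrix such as big.matrix from the bigmemory package, rather
than holding everything in RAM.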
-- 
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics      e.pebesma at wwu.de


