[R-sig-Geo] gDistance on a LARGE Spatial Object?

Roger Bivand Roger.Bivand at nhh.no
Fri Oct 26 23:47:00 CEST 2012


On Fri, 26 Oct 2012, Roger Bivand wrote:

> On Fri, 26 Oct 2012, Wouter Steenbeek wrote:
>
>> Disregarding solution1 (;-)), let me give you a practical example where I 
>> (think I) really need the full matrix. Or more specifically, an example in 
>> which I use the function gWithinDistance (but the same memory issues apply 
>> here).
>> 
>> I have a points.sp object containing my units of analysis (e.g., 20000 
>> centroids of streets). I also have a buildings.sp object containing the XY 
>> coordinates of all 30000 buildings in that area. I want to calculate for 
>> each street centroid how many buildings are located within a radius of 50 
>> meters of that centroid.
>> 
>> So now I do:
>> 
>> newbuilding <- rowSums(gWithinDistance(buildings.sp, points.sp, 50, 
>> byid=TRUE, hausdorff=FALSE, densifyFrac=NULL))
>> 
>> However, this of course suffers from the same memory problem. I've tried 
>> your solution 2 "think: what do you want to do with all these distances, 
>> can you get there without having all distances in a single matrix at 
>> once?". But I can't think of anything. Any tips?
>
> library(rgeos)
> data(meuse)
> coordinates(meuse) <- c("x", "y")
> set.seed(1)
> streets <- spsample(meuse, n=5000, type="random")
> buildings <- spsample(meuse, n=6000, type="random")
> res <- vector(mode="list", length=length(streets))
> tic <- Sys.time()
> for (i in 1:length(streets)) {
> buf <- gBuffer(streets[1], width=50)

  buf <- gBuffer(streets[i], width=50)

Sorry! The buffer has to use streets[i], not streets[1]. The loop now takes 16 seconds - add library(rgeos) first too.

Roger

> res[[i]] <- which(!is.na(over(buildings, buf)))
> }
> Sys.time() - tic
> # 10 seconds, will be more for your data sizes
> library(fortunes)
> fortune("Yoda")
>
> Roger
>
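
If rgeos is not at hand, the same count can be sketched in base R by working on the coordinate matrices in blocks of rows, so that only a chunk_size x n slice of distances exists at any time. This is only a rough sketch under assumptions: count_within() and chunk_size are hypothetical names, the coordinates are planar and in the same units as the radius, and with sp objects the two matrices would come from coordinates(). The toy data below are random, not the meuse samples used above.

```r
# Hypothetical helper (not part of rgeos or sp): for each row of `centres`,
# count the rows of `pts` that lie within `radius`.
count_within <- function(centres, pts, radius, chunk_size = 1000) {
  n <- nrow(centres)
  counts <- integer(n)
  # Split the row indices into blocks of at most chunk_size rows
  chunks <- split(seq_len(n), ceiling(seq_len(n) / chunk_size))
  for (idx in chunks) {
    # Coordinate differences for this block only: length(idx) x nrow(pts)
    dx <- outer(centres[idx, 1], pts[, 1], "-")
    dy <- outer(centres[idx, 2], pts[, 2], "-")
    # Compare squared distances, which avoids taking square roots
    counts[idx] <- as.integer(rowSums(dx^2 + dy^2 <= radius^2))
  }
  counts
}

# Toy data standing in for street centroids and buildings
set.seed(1)
streets_xy   <- cbind(runif(500, 0, 1000), runif(500, 0, 1000))
buildings_xy <- cbind(runif(800, 0, 1000), runif(800, 0, 1000))
n_buildings <- count_within(streets_xy, buildings_xy, radius = 50,
                            chunk_size = 100)
```

The result has one count per street centroid. A smaller chunk_size trades speed for a lower peak memory use.
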
>> 
>> Cheers, Wouter
>> 
>> 
>> ________________________________________
>> From: r-sig-geo-bounces at r-project.org [r-sig-geo-bounces at r-project.org] On 
>> Behalf Of Edzer Pebesma [edzer.pebesma at uni-muenster.de]
>> Sent: Friday, October 26, 2012 9:25 PM
>> To: r-sig-geo at r-project.org
>> Subject: Re: [R-sig-Geo] gDistance on a LARGE Spatial Object?
>> 
>> On 10/26/2012 08:46 PM, Wouter Steenbeek wrote:
>>> I am trying to use the "gDistance" function from the "rgeos" package to 
>>> calculate the Euclidean distance between a large number of points: to be 
>>> exact, 28000 points.
>>> 
>>> I'm doing:
>>> distance <- gDistance(points.sp)
>>> 
>>> R doesn't run this on my 32-bit 3Gb RAM machine. The same function works 
>>> on a subset of these points (say the first 400 points I call 
>>> "points1_400.sp"). I think that R doesn't run this because gDistance 
>>> keeps a matrix of 28000 * 28000 cells in RAM. At 8 bytes per cell, 
>>> this amounts to 28000 * 28000 * 8 / (1024 * 1024 * 1024) = almost 6 Gb 
>>> of RAM.
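
The size of a full 28000 x 28000 distance matrix can be checked directly in R. A minimal sketch, assuming 8 bytes per cell (a double-precision value) and ignoring any further overhead; the variable names are only illustrative:

```r
n <- 28000               # number of points
bytes_per_cell <- 8      # assumed: one double per distance
bytes <- n * n * bytes_per_cell
gib <- bytes / 1024^3    # bytes -> Kb -> Mb -> Gb
round(gib, 2)            # 5.84
```

That is well beyond what a 32-bit R process can address, whatever the installed RAM.
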
>>> 
>>> 1. Is my interpretation of the problem correct?
>> 
>> Yes.
>>> 
>>> 2. What would be the best way to tackle this problem? (chop up the 
>>> points.sp into chunks, calculate distances per chunk, and then merge 
>>> matrices together? I'm not sure how I can do that in R. Code examples of 
>>> how to do this are appreciated)
>> 
>> It will not make the object smaller, so will be difficult anyway on your
>> 3 Gb machine. As usual, there are two solutions:
>> 
>> 1. don't think: buy a computer with (much) more RAM;
>> 
>> 2. think: what do you want to do with all these distances, can you get
>> there without having all distances in a single matrix at once?
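
One possible reading of option 2, as a rough sketch rather than a recommendation: if what is needed is a per-point summary (here, purely as an illustration, the distance to the nearest other point), each block of rows can be summarised and thrown away before the next one is computed. nearest_dist() and chunk_size are hypothetical names, the code is base R only, and it assumes planar coordinates in a two-column matrix.

```r
# Hypothetical helper: distance from each point to its nearest other point,
# computed one block of rows at a time.
nearest_dist <- function(xy, chunk_size = 1000) {
  n <- nrow(xy)
  out <- numeric(n)
  chunks <- split(seq_len(n), ceiling(seq_len(n) / chunk_size))
  for (idx in chunks) {
    dx <- outer(xy[idx, 1], xy[, 1], "-")
    dy <- outer(xy[idx, 2], xy[, 2], "-")
    d <- sqrt(dx^2 + dy^2)                # length(idx) x n block of distances
    d[cbind(seq_along(idx), idx)] <- Inf  # ignore each point's distance to itself
    out[idx] <- apply(d, 1, min)          # keep only the summary for this block
  }
  out
}

set.seed(1)
xy <- cbind(runif(300), runif(300))
nn <- nearest_dist(xy, chunk_size = 64)
```

Peak memory then grows with chunk_size * n rather than n * n: a block of 1000 rows against 28000 points is 28 million doubles, roughly 214 Mb per temporary matrix.
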
>> --
>> Edzer Pebesma
>> Institute for Geoinformatics (ifgi), University of Münster
>> Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
>> 8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
>> http://www.52north.org/geostatistics      e.pebesma at wwu.de
>> 
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> 
>
>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

