# [R-sig-Geo] gDistance on a LARGE Spatial Object?

Wouter Steenbeek WSteenbeek at nscr.nl
Sat Oct 27 00:06:33 CEST 2012

```
About 15.5 seconds here. This is awesome, thanks!!

________________________________________
From: Roger Bivand [Roger.Bivand at nhh.no]
Sent: Friday, October 26, 2012 11:47 PM
To: Wouter Steenbeek
Cc: r-sig-geo at r-project.org
Subject: Re: [R-sig-Geo] gDistance on a LARGE Spatial Object?

On Fri, 26 Oct 2012, Roger Bivand wrote:

> On Fri, 26 Oct 2012, Wouter Steenbeek wrote:
>
>> Disregarding solution1 (;-)), let me give you a practical example where I
>> (think I) really need the full matrix. Or more specifically, an example in
>> which I use the function gWithinDistance (but the same memory issues apply
>> here).
>>
>> I have a points.sp object containing my units of analysis (e.g., 20000
>> centroids of streets). I also have a buildings.sp object containing the XY
>> coordinates of all 30000 buildings in that area. I want to calculate for
>> each street centroid how many buildings are located within a radius of 50
>> meters of that centroid.
>>
>> So now I do:
>>
>> newbuilding <- rowSums(gWithinDistance(buildings.sp,points.sp,50, byid=T,
>> hausdorff=FALSE, densifyFrac=NULL))
>>
>> However, this of course suffers from the same memory problem. I've tried
>> your solution 2 "think: what do you want to do with all these distances,
>> can you get there without having all distances in a single matrix at
>> once?". But I can't think of anything. Any tips?
>
> library(rgeos)
> data(meuse)
> coordinates(meuse) <- c("x", "y")
> set.seed(1)
> streets <- spsample(meuse, n=5000, type="random")
> buildings <- spsample(meuse, n=6000, type="random")
> res <- vector(mode="list", length=length(streets))
> tic <- Sys.time()
> for (i in 1:length(streets)) {
> buf <- gBuffer(streets[1], width=50)

buf <- gBuffer(streets[i], width=50)

Sorry! Now 16 seconds - add library(rgeos) first too.

Roger

> res[[i]] <- which(!is.na(over(buildings, buf)))
> }
> Sys.time() - tic
> # 10 seconds, will be more for your data sizes
> library(fortunes)
> fortune("Yoda")
>
> Roger
>
>>
>> Cheers, Wouter
>>
>>
>> ________________________________________
>> From: r-sig-geo-bounces at r-project.org [r-sig-geo-bounces at r-project.org] On
>> Behalf Of Edzer Pebesma [edzer.pebesma at uni-muenster.de]
>> Sent: Friday, October 26, 2012 9:25 PM
>> To: r-sig-geo at r-project.org
>> Subject: Re: [R-sig-Geo] gDistance on a LARGE Spatial Object?
>>
>> On 10/26/2012 08:46 PM, Wouter Steenbeek wrote:
>>> I am trying to use "gDistance" from the "rgeos" package to calculate
>>> the Euclidean distance between a large number of points: to be exact,
>>> 28000 points.
>>>
>>> I'm doing:
>>> distance <- gDistance(points.sp)
>>>
>>> R doesn't run this on my 32-bit 3Gb RAM machine. The same function works
>>> on a subset of these points (say the first 400 points I call
>>> "points1_400.sp"). I think R doesn't run this because gDistance keeps a
>>> matrix of 28000 * 28000 cells in RAM. At 8 bytes per cell, this amounts
>>> to 28000 * 28000 * 8 / (1024 bytes per Kb * 1024 Kb per Mb * 1024 Mb per
>>> Gb) = almost 6 Gb of RAM needed.
>>>
>>> 1. Is my interpretation of the problem correct?
>>
>> Yes.
>>>
>>> 2. What would be the best way to tackle this problem? (chop up the
>>> points.sp into chunks, calculate distances per chunk, and then merge
>>> matrices together? I'm not sure how I can do that in R. Code examples of
>>> how to do this are appreciated)
>>
>> It will not make the object smaller, so will be difficult anyway on your
>> 3 Gb machine. As usual, there are two solutions:
>>
>> 1. don't think: buy a computer with (much) more RAM;
>>
>> 2. think: what do you want to do with all these distances, can you get
>> there without having all distances in a single matrix at once?
>> --
>> Edzer Pebesma
>> Institute for Geoinformatics (ifgi), University of Münster
>> Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
>> 8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
>> http://www.52north.org/geostatistics      e.pebesma at wwu.de
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
>

--
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no

```
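
The "solution 2" idea from the thread (get the counts without ever holding all distances at once) can also be sketched in base R, without rgeos. This is only an illustration under stated assumptions, not the method used in the thread: the function name `count_within`, its arguments and the `chunk_size` default are hypothetical, and the coordinates are assumed to be planar (projected) and in metres. Streets are processed in chunks, so only a `chunk_size` x `n_buildings` block of distances exists at any one time.

```r
# Hypothetical helper: count, for each street centroid, the buildings
# within `radius`. Both inputs are two-column numeric matrices (x, y)
# in a projected coordinate system.
count_within <- function(streets_xy, buildings_xy, radius, chunk_size = 250) {
  n <- nrow(streets_xy)
  if (n == 0) return(integer(0))
  counts <- integer(n)
  r2 <- radius^2  # compare squared distances, so no sqrt() is needed
  for (start in seq(1, n, by = chunk_size)) {
    idx <- start:min(start + chunk_size - 1, n)
    # Coordinate differences: rows = streets in this chunk, cols = buildings
    dx <- outer(streets_xy[idx, 1], buildings_xy[, 1], "-")
    dy <- outer(streets_xy[idx, 2], buildings_xy[, 2], "-")
    # Number of buildings at or inside the radius, per street in the chunk
    counts[idx] <- as.integer(rowSums(dx^2 + dy^2 <= r2))
  }
  counts
}

# Usage with made-up random coordinates (not data from the thread)
set.seed(1)
streets_xy   <- cbind(runif(2000, 0, 5000), runif(2000, 0, 5000))
buildings_xy <- cbind(runif(3000, 0, 5000), runif(3000, 0, 5000))
counts <- count_within(streets_xy, buildings_xy, radius = 50)

# Size in GiB of a full 28000 x 28000 matrix of doubles (8 bytes per cell)
28000^2 * 8 / 1024^3  # about 5.84
```

With sp objects, the two matrices could be taken from `coordinates(points.sp)` and `coordinates(buildings.sp)`. Each chunk needs a few temporary matrices of `chunk_size * n_buildings * 8` bytes, so `chunk_size` can be lowered on a machine with little RAM at the cost of more loop iterations.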