[R-sig-Geo] gDistance on a LARGE Spatial Object?
Roger Bivand
Roger.Bivand at nhh.no
Fri Oct 26 23:40:40 CEST 2012
On Fri, 26 Oct 2012, Wouter Steenbeek wrote:
> Disregarding solution1 (;-)), let me give you a practical example where
> I (think I) really need the full matrix. Or more specifically, an
> example in which I use the function gWithinDistance (but the same memory
> issues apply here).
>
> I have a points.sp object containing my units of analysis (e.g., 20000
> centroids of streets). I also have a buildings.sp object containing the
> XY coordinates of all 30000 buildings in that area. I want to calculate
> for each street centroid how many buildings are located within a radius
> of 50 meters of that centroid.
>
> So now I do:
>
> newbuilding <- rowSums(gWithinDistance(buildings.sp,points.sp,50,
> byid=T, hausdorff=FALSE, densifyFrac=NULL))
>
> However, this of course suffers from the same memory problem. I've tried
> your solution 2 "think: what do you want to do with all these distances,
> can you get there without having all distances in a single matrix at
> once?". But I can't think of anything. Any tips?
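For scale, here is a rough sketch of the memory arithmetic behind both problems. It assumes dense matrices, 8 bytes per cell for the doubles returned by gDistance, and 4 bytes per cell for a logical matrix such as the one gWithinDistance is assumed to return with byid=TRUE. The helper name gib is made up for illustration.

```r
# Size of a dense nrow x ncol matrix in GiB (1 GiB = 1024^3 bytes).
# gib() is a hypothetical helper, not part of rgeos or sp.
gib <- function(nrow, ncol, bytes_per_cell) {
  nrow * ncol * bytes_per_cell / 1024^3
}

# 28000 x 28000 distances stored as doubles (8 bytes each)
dist_gib <- gib(28000, 28000, 8)

# 20000 street centroids x 30000 buildings stored as logicals (4 bytes each)
within_gib <- gib(20000, 30000, 4)

round(c(distances = dist_gib, within = within_gib), 2)
```

Neither matrix leaves much room, if any, in the address space of a 32-bit R process, which is why the approach below avoids building them.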
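library(sp)     # meuse, coordinates(), spsample(), over()
library(rgeos)  # gBuffer()
data(meuse)
coordinates(meuse) <- c("x", "y")
set.seed(1)
# simulate street centroids and buildings inside the meuse bounding box
streets <- spsample(meuse, n=5000, type="random")
buildings <- spsample(meuse, n=6000, type="random")
res <- vector(mode="list", length=length(streets))
tic <- Sys.time()
for (i in 1:length(streets)) {
  # buffer the i-th street point by 50 m ...
  buf <- gBuffer(streets[i], width=50)
  # ... and keep the indices of the buildings that fall inside the buffer
  res[[i]] <- which(!is.na(over(buildings, buf)))
}
Sys.time() - tic
# 10 seconds, will be more for your data sizes
# number of buildings within 50 m of each street point
newbuilding <- sapply(res, length)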
library(fortunes)
fortune("Yoda")
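If the buffer loop turns out to be too slow for the full data, one alternative is to compute the distances in chunks and keep only the counts. The following is a minimal sketch in base R, not a tested replacement for the rgeos code: it assumes planar coordinates in metres held in plain two-column matrices, and the function name count_within and its chunk_size argument are invented for this example.

```r
# Count, for every row of `centers`, how many rows of `pts` lie within
# `radius`. Only a chunk_size x nrow(pts) block is held in memory at a time.
# count_within() is a hypothetical helper, not part of sp or rgeos.
count_within <- function(centers, pts, radius, chunk_size = 200L) {
  n <- nrow(centers)
  if (n == 0L) return(integer(0))
  counts <- integer(n)
  for (s in seq(1L, n, by = chunk_size)) {
    idx <- s:min(s + chunk_size - 1L, n)
    # coordinate differences between this chunk of centers and all points
    dx <- outer(centers[idx, 1], pts[, 1], "-")
    dy <- outer(centers[idx, 2], pts[, 2], "-")
    # compare squared distances to avoid taking square roots
    counts[idx] <- as.integer(rowSums(dx^2 + dy^2 <= radius^2))
  }
  counts
}

# Simulated data standing in for street centroids and buildings
set.seed(1)
streets_xy <- cbind(runif(500, 0, 1000), runif(500, 0, 1000))
buildings_xy <- cbind(runif(600, 0, 1000), runif(600, 0, 1000))
n_buildings <- count_within(streets_xy, buildings_xy, radius = 50,
                            chunk_size = 100L)
head(n_buildings)
```

With Spatial objects, coordinates(points.sp) and coordinates(buildings.sp) return matrices of this shape. A smaller chunk_size lowers peak memory at the cost of more loop iterations.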
Roger
>
> Cheers, Wouter
>
>
> ________________________________________
> From: r-sig-geo-bounces at r-project.org [r-sig-geo-bounces at r-project.org] On Behalf Of Edzer Pebesma [edzer.pebesma at uni-muenster.de]
> Sent: Friday, October 26, 2012 9:25 PM
> To: r-sig-geo at r-project.org
> Subject: Re: [R-sig-Geo] gDistance on a LARGE Spatial Object?
>
> On 10/26/2012 08:46 PM, Wouter Steenbeek wrote:
>> I am trying to use "gDistance" from the "rgeos" package to calculate the Euclidean distances between a large number of points: to be exact, 28000 points.
>>
>> I'm doing:
>> distance <- gDistance(points.sp)
>>
>> R doesn't run this on my 32-bit machine with 3 GB of RAM. The same function works on a subset of these points (say the first 400 points, which I call "points1_400.sp"). I think R fails because gDistance keeps a matrix of 28000 * 28000 cells in RAM. At eight bytes per cell, this amounts to 28000 * 28000 * 8 / (1024 * 1024 * 1024) = about 5.8 GB, so almost 6 GB of RAM is needed.
>>
>> 1. Is my interpretation of the problem correct?
>
> Yes.
>>
>> 2. What would be the best way to tackle this problem? (Chop up points.sp into chunks, calculate distances per chunk, and then merge the matrices together? I'm not sure how I can do that in R. Code examples of how to do this would be appreciated.)
>
> It will not make the object smaller, so will be difficult anyway on your
> 3 Gb machine. As usual, there are two solutions:
>
> 1. don't think: buy a computer with (much) more RAM;
>
> 2. think: what do you want to do with all these distances, can you get
> there without having all distances in a single matrix at once?
> --
> Edzer Pebesma
> Institute for Geoinformatics (ifgi), University of Münster
> Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
> 8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de
> http://www.52north.org/geostatistics e.pebesma at wwu.de
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no