[R-sig-Geo] speed problem with %over% function (Francis Markham)

gregor.hochschild at gmx.de
Wed Feb 29 14:27:55 CET 2012


I had the same experience as Francis Markham. I recently used the over function to check which polygon from a shapefile each point in a large set falls into, and it crashed because of memory requirements, even though I was running it in the cloud with 12GB of RAM.

Here are some details; reproducible code and data are below:
600,000 points, SpatialPolygonsDataFrame with 2166 rows
over(SpatialPoints.Obj, SpatialPolygonsDataFrame.Obj)
I ended up processing the points in chunks of 100,000, which worked quite well but is far from optimal (a sketch of the chunked approach follows the code below).


Greg



# data with sample of 300,000 points
# http://www.mediafire.com/?d1ds17guxg656s1,s5snsbafvps36fr
# (I will take these files down after some time)

require ("rgdal")
require ("maptools")

# load data
map.proj = CRS(" +proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333 +lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0 +datum=NAD83 +units=us-ft +no_defs +ellps=GRS80 +towgs84=0,0,0")
points   = read.csv('~/Dropbox/sp-over/points.csv',sep=",", stringsAsFactors=FALSE)
map.nyc  = readShapeSpatial('/Users/jpl2136/Dropbox/sp-over/nyc-census-tract/nyct2010',proj4string=map.proj)

# project points
coord    =project(cbind(points$longitude,points$latitude),proj4string(map.nyc))
sppoints =SpatialPoints(coord,proj4string=map.proj)
# over function
census.track=sp::over(sppoints,map.nyc)
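
For reference, here is a minimal sketch of the chunked workaround I described above. The chunk size matches the 100,000 I used, but the split logic is illustrative rather than the exact code I ran.

# process the points in chunks so no single over() call exhausts memory
chunk.size = 100000
n          = nrow(coordinates(sppoints))          # total number of points
starts     = seq(1, n, by = chunk.size)           # first index of each chunk
census.track = do.call(rbind, lapply(starts, function(i) {
    idx = i:min(i + chunk.size - 1, n)            # indices of this chunk
    sp::over(sppoints[idx, ], map.nyc)            # overlay just this slice
}))

Since over() returns one row per query point, rbind-ing the per-chunk results in order reproduces the data frame a single call over all points would return.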





