[R-sig-Geo] best practice for reading large shapefiles?

Vinh Nguyen vinhdizzo at gmail.com
Tue Apr 26 20:18:26 CEST 2016


Hi,

I have a very large shapefile that I would like to read into R
(dbf = 5.6 GB, shp = 2.3 GB).

For reference, I downloaded the 30 shapefiles of the [Public Land
Survey System](http://www.geocommunicator.gov/GeoComm/lsis_home/home/)
and combined them into a single national file with GDAL (ogr2ogr) as
described [here](http://www.northrivergeographic.com/ogr2ogr-merge-shapefiles).
I originally attempted to combine the files in R as described
[here](https://stat.ethz.ch/pipermail/r-sig-geo/2011-May/011814.html),
but ran out of memory about 80% of the way through; luckily, I then
discovered ogr2ogr.
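
For readers who want to reproduce the merge step, here is a rough sketch of
it, driven from R to stay consistent with the rest of this post. It assumes
ogr2ogr is on the PATH. The helper `ogr2ogr_merge_args` and all file names
are hypothetical, and the flags follow the commonly used create-then-append
pattern, which may differ from the exact commands in the linked article.

```r
# Hypothetical helper: build the ogr2ogr argument vectors needed to merge
# several shapefiles into one. Nothing is executed here.
ogr2ogr_merge_args <- function(inputs, out) {
  stopifnot(length(inputs) >= 1)
  # Layer name of the merged shapefile, derived from the output file name.
  layer <- tools::file_path_sans_ext(basename(out))
  # The first call creates the output file from the first input.
  first <- list(c("-f", "ESRI Shapefile", out, inputs[1]))
  # Each later call appends one more input to the existing layer.
  rest <- lapply(inputs[-1], function(src) {
    c("-f", "ESRI Shapefile", "-update", "-append", out, src, "-nln", layer)
  })
  c(first, rest)
}

# Usage (assumed paths; uncomment to run against real files):
# shps <- list.files("plss", pattern = "\\.shp$", full.names = TRUE)
# for (args in ogr2ogr_merge_args(shps, "plss_national.shp")) {
#   system2("ogr2ogr", shQuote(args))
# }
```

Because ogr2ogr handles one input at a time, the merge does not need to hold
all 30 files in memory at once, unlike the in-R approach.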

I'm reading the combined file into R via readOGR. It has been over an
hour, and R appears to have hung.  When I check the task manager, the R
session is using less than 10% CPU and 245 MB of memory.  I'm not sure
whether any productive activity is going on, so I'm just waiting it out.
[This](http://r-sig-geo.2731867.n2.nabble.com/Long-time-to-load-shapefiles-td7584869.html)
thread notes that readOGR can be slow for large shapefiles and suggests
saving the resulting Spatial*DataFrame object in an R format.  My
problem is getting the entire shapefile read in the first place, before
I can save it as an R object.
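
The "save it in an R format" suggestion from that thread could be sketched as
follows. This is only an illustration of the read-once-then-cache idea: the
helper `read_cached` and the paths in the usage comment are hypothetical, and
it does nothing to speed up the first, slow read.

```r
# Hypothetical helper: return the object cached at `rds_path` if it exists;
# otherwise call `reader()` once, save the result with saveRDS(), and return it.
read_cached <- function(rds_path, reader) {
  if (file.exists(rds_path)) {
    return(readRDS(rds_path))   # fast path: load the serialized R object
  }
  obj <- reader()               # slow path: e.g. the readOGR call
  saveRDS(obj, rds_path)        # cache for later sessions
  obj
}

# Usage (assumed directory and layer name; requires the rgdal package):
# plss <- read_cached("plss_national.rds", function() {
#   rgdal::readOGR(dsn = "path/to/shapefile_dir", layer = "plss_national")
# })
```

Passing the reader as a function keeps the caching logic independent of
whichever package ends up doing the actual read.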

Does anyone have any suggestions for reading this large shapefile into
R?  Thank you for your help.

-- Vinh


