[R-sig-Geo] best practice for reading large shapefiles?

Vinh Nguyen vinhdizzo at gmail.com
Tue Apr 26 21:11:17 CEST 2016


Would loading the shapefile into PostgreSQL first and then using
readOGR to read from Postgres be a recommended approach?  That is,
would the bottleneck still occur?  Thank you.
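
For concreteness, here is a minimal sketch of the load step I have in
mind, using ogr2ogr's PostgreSQL driver.  The file, database, and table
names are placeholders (assumptions, not my actual setup), and it
assumes a PostGIS-enabled database already exists.  As written it only
does a dry run that prints the command:

```shell
#!/bin/sh
# Sketch: load a shapefile into a PostGIS table with ogr2ogr.
# All file, database, and table names below are hypothetical placeholders.

# Usage: load_shapefile SHP DB TABLE
# If DRY_RUN is set, the command is printed instead of executed.
load_shapefile() {
    ${DRY_RUN:+echo} ogr2ogr -f PostgreSQL "PG:dbname=$2" "$1" \
        -nln "$3" -nlt PROMOTE_TO_MULTI -progress
}

# Dry run in a subshell: print the command without touching the database.
( DRY_RUN=1; load_shapefile plss_national.shp plss plss_national )

# To run the load for real (requires GDAL built with the PostgreSQL driver):
# load_shapefile plss_national.shp plss plss_national
```

The table could then presumably be read with readOGR using
"PG:dbname=plss" as the data source and the table name as the layer,
assuming the GDAL that rgdal links against has PostgreSQL support.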

-- Vinh


On Tue, Apr 26, 2016 at 11:18 AM, Vinh Nguyen <vinhdizzo at gmail.com> wrote:
> Hi,
>
> I have a very large shapefile that I would like to read into R
> (dbf = 5.6 GB and shp = 2.3 GB).
>
> For reference, I downloaded the 30 shapefiles of the [Public Land
> Survey System](http://www.geocommunicator.gov/GeoComm/lsis_home/home/)
> and combined them into a single national file via gdal (ogr2ogr) as
> described [here](http://www.northrivergeographic.com/ogr2ogr-merge-shapefiles);
> I originally attempted to combine the files in R as described
> [here](https://stat.ethz.ch/pipermail/r-sig-geo/2011-May/011814.html),
> but ran out of memory about 80% of the way through, and then luckily
> discovered ogr2ogr.
>
> I'm reading the combined file into R via readOGR; it has been over
> an hour and R appears to hang.  According to the task manager, the R
> session is currently using <10% CPU and 245 MB of memory.  I'm not
> sure whether any productive activity is going on, so I'm just waiting
> it out.
> [This](http://r-sig-geo.2731867.n2.nabble.com/Long-time-to-load-shapefiles-td7584869.html)
> thread notes that readOGR can be slow for large shapefiles and
> suggests saving the resulting Spatial*DataFrame in an R format.  My
> problem is getting the entire shapefile read in the first place, before
> I can save it as an R object.
>
> Does anyone have any suggestions for reading this large shapefile into
> R?  Thank you for your help.
>
> -- Vinh


