[R-sig-Geo] best practice for reading large shapefiles?

Roger Bivand Roger.Bivand at nhh.no
Tue Apr 26 22:12:16 CEST 2016


On Tue, 26 Apr 2016, Vinh Nguyen wrote:

> Would loading the shapefile into PostgreSQL first and then using readOGR
> to read from PostgreSQL be a recommended approach?  That is, would the
> bottleneck still occur?  Thank you.

Most likely, as both routes go through their respective OGR drivers. With
data this size, you'll need a competent platform (probably Linux, with say
128GB RAM), as everything is held in memory. I find it hard to grasp what
the point of doing this might be: visualization won't work, as none of the
considerable detail that is certainly in these files will be visible. Can
you put the lot into an SQLite file and access the attributes through SQL
queries? I don't see the analysis or statistics here.
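
A minimal sketch of the SQLite route, in Python with only the standard
library. Everything named here is an assumption for illustration: the table
`plss_attrs`, its columns, and the three rows are hypothetical stand-ins for
the .dbf attributes, not the real PLSS schema. In practice the table would
come from converting the shapefile (for example with ogr2ogr's SQLite
driver), not from hand-written inserts.

```python
import sqlite3

# Hypothetical attribute rows standing in for the .dbf contents;
# the real PLSS column names and values will differ.
ROWS = [
    (1, "CA", "T1N", "R1E", 640.0),
    (2, "CA", "T1N", "R2E", 638.5),
    (3, "NV", "T2S", "R1W", 641.2),
]


def build_db(path=":memory:"):
    """Create the (hypothetical) attribute table and index it by state."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE plss_attrs ("
        "fid INTEGER PRIMARY KEY, state TEXT, township TEXT, "
        "rng TEXT, acres REAL)"
    )
    conn.executemany("INSERT INTO plss_attrs VALUES (?, ?, ?, ?, ?)", ROWS)
    # An index lets per-state queries avoid scanning the whole table.
    conn.execute("CREATE INDEX idx_state ON plss_attrs(state)")
    conn.commit()
    return conn


def acres_by_state(conn):
    """Aggregate inside the database; only the summary rows reach memory."""
    cur = conn.execute(
        "SELECT state, COUNT(*), SUM(acres) FROM plss_attrs "
        "GROUP BY state ORDER BY state"
    )
    return cur.fetchall()


def rows_for_state(conn, state):
    """Fetch only the subset of attribute rows that is actually needed."""
    cur = conn.execute(
        "SELECT fid, township, rng FROM plss_attrs "
        "WHERE state = ? ORDER BY fid",
        (state,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_db()
    print(acres_by_state(conn))
    print(rows_for_state(conn, "NV"))
```

The point of the design is that filtering and aggregation happen inside
SQLite, so only the rows a query returns have to fit in memory, rather than
the whole attribute table.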

Roger

>
> -- Vinh
>
>
> On Tue, Apr 26, 2016 at 11:18 AM, Vinh Nguyen <vinhdizzo at gmail.com> wrote:
>> Hi,
>>
>> I have a very large shapefile that I would like to read into R
>> (dbf=5.6gb and shp=2.3gb).
>>
>> For reference, I downloaded the 30 shapefiles of the [Public Land
>> Survey System](http://www.geocommunicator.gov/GeoComm/lsis_home/home/)
>> and combined them into a single national file via gdal (ogr2ogr) as
>> described [here](http://www.northrivergeographic.com/ogr2ogr-merge-shapefiles);
>> I originally attempted to combine the files in R as described
>> [here](https://stat.ethz.ch/pipermail/r-sig-geo/2011-May/011814.html),
>> but ran out of memory about 80% in, but luckily discovered ogr2ogr.
>>
>> I'm reading in the combined file in R via readOGR, and it's been over
>> an hour and R appears to hang.  When I check the task manager, the R
>> session currently consumes <10% CPU and 245MB.  Not sure if any
>> productive activity is going on, so I'm just waiting it out.
>> [This](http://r-sig-geo.2731867.n2.nabble.com/Long-time-to-load-shapefiles-td7584869.html)
>> thread notes that readOGR can be slow for large shapefiles, and
>> suggests saving the resulting Spatial*DataFrame in an R format.  My
>> problem is getting the entire shapefile read in the first place, before
>> I can save it as an R object.
>>
>> Does anyone have any suggestions for reading this large shapefile into
>> R?  Thank you for your help.
>>
>> -- Vinh
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no
http://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
http://depsy.org/person/434412


