[R-sig-Geo] Exporting Multipolygon geojson with rgdal::writeOGR

Roger Bivand Roger.Bivand at nhh.no
Tue Nov 3 00:09:25 CET 2015


On Mon, 2 Nov 2015, Andy Teucher wrote:

> Amazing (and fast) work Roger, and thanks for the great explanation.
> I can see why you wouldn't want to redo the way Polygons objects are
> structured - I'm sure it would be break a huge amount of existing
> code.
>
> Just curious, does OSX take longer because you have to deal with
> multiple possible GDAL versions?

No, the builds of the underlying external dependencies for the different 
platforms are handled by the CRAN administrators, Simon Urbanek for OSX, 
and I believe he has to trigger a fresh build manually. Windows (Uwe 
Ligges and Brian Ripley) also has the R Winbuilder service, which is worth 
knowing about, and is more automated.

Once the Windows build train compilers are upgraded, we'll move towards 
getting GDAL 2.0.* on these platforms.

For those using OSX, following the R-sig-mac list is very informative - 
see the thread in October on El Capitan ... of course working around 
things like this take time and attention.

Roger

>
> Thanks again,
> Andy
>
> On Mon, Nov 2, 2015 at 1:21 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>> Hi Andy,
>>
>> On Mon, 26 Oct 2015, Andy Teucher wrote:
>>
>>> Hi Roger,
>>>
>>> Sorry I haven't been much help here - it's not for lack of trying,
>>> just lack of skills. I've been poring through OGR_write.cpp and trying
>>> to figure out language and the flow of things... a few thoughts
>>> occurred to me (which are probably obvious to you):
>>>
>>
>> Your example was very helpful, the problem was that when writeOGR() and the
>> underlying compiled code was written, the model used was that of the ESRI
>> Shapefile driver, rather than OGC SFS, which was first published in 2004.
>> Most older formats (and their drivers in OGR) use a OGR geometry factory
>> function to organize polygons on reading - but this is not present in
>> shapelib (bundled with the R maptools package), so when designing polygon
>> classes for sp, and writing rgdal::writeOGR() about 10 years ago, this was
>> not known, and was not revisited subsequently, until you happily spotted the
>> bug.
>>
>> GEOS in rgeos uses OGC SFS definitions (the ones you are using below), but
>> the SpatialPolygons/Polygons/Polygon structure was already established by
>> then, and a fix (using comment attributes in Polygons objects) was used,
>> rather than a major change to Polygons objects (in 2010).
>>
>> The revision to rgdal::writeOGR() now branches on the input objects - if
>> they have comments (applied in code, not user-facing), these are trusted,
>> and the MultiPolygon objects constructed containing one or more Polygon
>> objects with one exterior ring and zero or more interior rings on that
>> basis. However, if the input objects do not have comment attributes,
>> OGRGeometryFactory::organizePolygons() (C++) is used to create properly
>> organized MultiPolygon objects for export.
>>
>> This means that the pre-OGC SFS drivers continue to work, but that the OGC
>> SFS drivers, like GeoJSON, now also work correctly when the second ring in
>> an object is not assumed to be an interior ring (rgdal < 1.1-1). A test set
>> for all the possible cases has been introduced.
>>
>> The revision is now in source package form on CRAN:
>>
>> https://cran.r-project.org/web/packages/rgdal/index.html
>>
>> and a Windows binary package will be available later on Tuesday I hope. The
>> OSX revision may take a little longer, but we are grateful to have what we
>> have anyway.
>>
>> Thanks for a clear and very helpful bug report.
>>
>> Best wishes,
>>
>> Roger
>>
>>
>>> 1) Is this bug because holes are standalone polygons in sp objects,
>>> while they are members of polygon objects in geojson, wkt, etc?
>>>
>>> 2) On a related note, I notice that a single polygon with a hole is
>>> identified as a multipolygon containing a single polygon with a hole,
>>> when using writeOGR and one of the affected drivers.  Practically I
>>> don't think this matters too much, but it's not quite right.
>>>
>>> 3) In lines 116-160 the polygon/multipolygon and line/multiline
>>> determination is made for the entire layer. Can/should this be made
>>> for each feature in the layer, as you could have a layer with a
>>> mixture of single polygons and multipolygons?
>>>
>>> 4) I feel like there ought to be one more layer of looping through the
>>> polyons. e.g.:
>>>
>>> loop over each feature;
>>>     determine whether each feature is a single polygon (length 1 or
>>> all rings but one are holes), or a multipolygon:
>>>        if a polygon loop over and add ring(s);
>>>        if a multipolygon loop over polygons:
>>>            for each polygon add ring(s) and somehow assign holes to
>>> parent polygons
>>>
>>> Again, apologies if I'm being totally obvious (or way off base).
>>> Thanks for all your work on these packages, and on this issue.
>>>
>>> Andy
>>>
>>>
>>> On Thu, Oct 22, 2015 at 11:07 PM, Roger Bivand <Roger.Bivand at nhh.no>
>>> wrote:
>>>>
>>>> On Fri, 23 Oct 2015, Andy Teucher wrote:
>>>>
>>>>> On my Mac, I am running rgdal 1.0-7 and GDAL 1.11.3
>>>>>
>>>>>
>>>>> sessionInfo()
>>>>>
>>>>
>>>> OK, thanks.
>>>>
>>>> The problem also affects the SQLite driver (the CRAN binaries do not have
>>>> this driver). Could someone please check the PostGIS driver - my guess is
>>>> that all OGC SFS compliant drivers may be affected.
>>>>
>>>> Could somebody also please check whether this is a recently introduced
>>>> issue
>>>> or not? For example somebody still on rgdal < 1.0? src/OGR_write.cpp was
>>>> changed 2015-08-21, but the last previous change was 2015-06-11, then
>>>> 2015-05-31, 2014-08-17 ... from the ChangeLog visible on the package CRAN
>>>> page.
>>>>
>>>> Roger
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> R version 3.2.2 (2015-08-14)
>>>>>
>>>>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>>>>>
>>>>> Running under: OS X 10.11 (El Capitan)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> locale:
>>>>>
>>>>> [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> attached base packages:
>>>>>
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> other attached packages:
>>>>>
>>>>> [1] rgdal_1.0-7 sp_1.2-1
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>>
>>>>> [1] tools_3.2.2     grid_3.2.2      lattice_0.20-33
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Andy
>>>>>
>>>>> On Thu, Oct 22, 2015 at 11:28 AM, Andy Teucher <andy.teucher at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> One more test: WKT has the same issue:
>>>>>> library(rgdal)
>>>>>> js <- '{
>>>>>> "type": "MultiPolygon",
>>>>>> "coordinates": [[[[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0,
>>>>>> 3.0],
>>>>>> [102.0, 2.0]]], [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0,
>>>>>> 1.0],
>>>>>> [100.0, 0.0]]]]
>>>>>> } '
>>>>>> spdf <- readOGR(js, layer='OGRGeoJSON', verbose=FALSE)
>>>>>> writeOGR(spdf, dsn = "test.csv", layer = "test", driver="CSV",
>>>>>> layer_options = "GEOMETRY=AS_WKT")
>>>>>> cat(readLines("test.csv"), sep = "\n")
>>>>>> WKT,FID,
>>>>>> "MULTIPOLYGON (((102 2,102 3,103 3,103 2,102 2),(100 0,100 1,101 1,101
>>>>>> 0,100 0)))",0
>>>>>> It should be:
>>>>>> WKT,FID,
>>>>>> "MULTIPOLYGON (((102 2,102 3,103 3,103 2,102 2)),((100 0,100 1,101
>>>>>> 1,101 0,100 0)))",0
>>>>>> Andy
>>>>>> On Thu, Oct 22, 2015 at 10:36 AM, Andy Teucher <andy.teucher at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Thanks very much Roger.
>>>>>>>
>>>>>>> I've replicated this on my work machine (Windows) and on my Mac at
>>>>>>> home.
>>>>>>>
>>>>>>> KML and GML are both affected. ESRI shapefile works fine and, though
>>>>>>> I'm not very familiar with the format, 'MapInfo File' works too
>>>>>>> (loading the file in qgis shows a multipart polygon with no geometry
>>>>>>> errors).
>>>>>>>
>>>>>>> sessionInfo() for my Windows machine (using rgdal 1.0-7 with included
>>>>>>> gdal drivers):
>>>>>>>
>>>>>>> R version 3.2.2 (2015-08-14)
>>>>>>> Platform: i386-w64-mingw32/i386 (32-bit)
>>>>>>> Running under: Windows 7 x64 (build 7601) Service Pack 1
>>>>>>>
>>>>>>> locale:
>>>>>>> [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
>>>>>>> LC_MONETARY=English_Canada.1252
>>>>>>> [4] LC_NUMERIC=C                    LC_TIME=English_Canada.1252
>>>>>>>
>>>>>>> attached base packages:
>>>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>>
>>>>>>> other attached packages:
>>>>>>> [1] rgdal_1.0-7 sp_1.2-0
>>>>>>>
>>>>>>> loaded via a namespace (and not attached):
>>>>>>> [1] tools_3.2.2     grid_3.2.2      lattice_0.20-33
>>>>>>>
>>>>>>>
>>>>>>> I'll send the details for my Mac environment this evening.
>>>>>>>
>>>>>>> I'll dig into the source code as soon as I can - C is a completely
>>>>>>> foreign language to me, so I can't promise anything fast!
>>>>>>>
>>>>>>> Thanks again,
>>>>>>> Andy
>>>>>>>
>>>>>>> On Thu, Oct 22, 2015 at 12:27 AM, Roger Bivand <Roger.Bivand at nhh.no>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, 22 Oct 2015, Andy Teucher wrote:
>>>>>>>>
>>>>>>>>> I’m finding that writeOGR isn’t exporting multipolygons properly
>>>>>>>>> using
>>>>>>>>> the GeoJSON driver. I have a simple test case (borrowed from here:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://gis.stackexchange.com/questions/137977/writeogr-alters-multipolygon-holes)
>>>>>>>>> with a geojson string with one multipolygon containing two polygons.
>>>>>>>>> I
>>>>>>>>> use readOGR to create a SpatialPolygonsDataFrame out of it, then
>>>>>>>>> write
>>>>>>>>> it with writeOGR:
>>>>>>>>>
>>>>>>>>> library(rgdal)
>>>>>>>>>
>>>>>>>>> js <- '{
>>>>>>>>> "type": "MultiPolygon",
>>>>>>>>> "coordinates": [[[[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0,
>>>>>>>>> 3.0],
>>>>>>>>> [102.0, 2.0]]], [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0,
>>>>>>>>> 1.0],
>>>>>>>>> [100.0, 0.0]]]]
>>>>>>>>> } '
>>>>>>>>>
>>>>>>>>> spdf <- readOGR(js, layer='OGRGeoJSON', verbose=FALSE) # Create a
>>>>>>>>> SpatialPoygonsDataFrame
>>>>>>>>>
>>>>>>>>> temp <- tempfile()
>>>>>>>>> writeOGR(spdf, dsn = temp, layer = "", driver="GeoJSON")
>>>>>>>>> cat(readLines(temp))
>>>>>>>>>
>>>>>>>>> # Output:
>>>>>>>>> { "type": "FeatureCollection", "crs": { "type": "name",
>>>>>>>>> "properties":
>>>>>>>>> { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } }, "features": [ {
>>>>>>>>> "type":
>>>>>>>>> "Feature", "id": 0, "properties": { "FID": 0 }, "geometry": {
>>>>>>>>> "type":
>>>>>>>>> "MultiPolygon", "coordinates": [ [ [ [ 102.0, 2.0 ], [ 102.0, 3.0 ],
>>>>>>>>> [
>>>>>>>>> 103.0, 3.0 ], [ 103.0, 2.0 ], [ 102.0, 2.0 ] ], [ [ 100.0, 0.0 ], [
>>>>>>>>> 100.0, 1.0 ], [ 101.0, 1.0 ], [ 101.0, 0.0 ], [ 100.0, 0.0 ] ] ] ] }
>>>>>>>>> }
>>>>>>>>> ] }
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I can replicate this with GDAL 2.0.1 and rgdal 1.0-7 - reading temp
>>>>>>>> in
>>>>>>>> gives
>>>>>>>> the orphaned hole. I'm surprised that passing the js object to
>>>>>>>> readOGR()
>>>>>>>> doesn't fail, but that isn't the source of the problem. spdf appears
>>>>>>>> to
>>>>>>>> be
>>>>>>>> structured correctly. Please provide your GDAL and rgdal versions,
>>>>>>>> and
>>>>>>>> OS
>>>>>>>> details from sessionInfo().
>>>>>>>>
>>>>>>>> We are completely relying on the GDAL drivers here - we can't cherry
>>>>>>>> pick
>>>>>>>> for particular drivers (historically excepting ESRI Shapefile), so
>>>>>>>> your
>>>>>>>> debugging will need to examine writeOGR(), the C code it calls, and
>>>>>>>> the
>>>>>>>> interactions between the rgdal C code and called OGR functions.
>>>>>>>> Please
>>>>>>>> check
>>>>>>>> which other drivers are affected (I think KML is, is GML?). Can you
>>>>>>>> place
>>>>>>>> debugging Rprintf(...); in the rgdal C code to see where this is
>>>>>>>> coming
>>>>>>>> from?
>>>>>>>>
>>>>>>>> I can start looking at this sometime next week, so please make a
>>>>>>>> start
>>>>>>>> yourself; contributions from others are very welcome.
>>>>>>>>
>>>>>>>> Roger
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you look closely at the output, you can see that the
>>>>>>>>> 'coordinates'
>>>>>>>>> array now contains a single polygon array with two coordinate
>>>>>>>>> arrays:
>>>>>>>>> the boundary, and a second one which is now treated as a hole of the
>>>>>>>>> first (orphaned as it is outside the bounds of the polygon). The
>>>>>>>>> original 'coordinates' array consists of two polygon arrays, each
>>>>>>>>> consisting of a single coordinate array which defining a polygon
>>>>>>>>> (with
>>>>>>>>> no holes), which is correct according the GeoJSON spec:
>>>>>>>>> http://geojson.org/geojson-spec.html#polygon.
>>>>>>>>>
>>>>>>>>> I'm always hesitant to call things a bug, but this doesn't appear to
>>>>>>>>> happen using ogr2ogr on the command line:
>>>>>>>>>
>>>>>>>>> writeOGR(spdf, ".", "test", driver = "ESRI Shapefile") # Write a
>>>>>>>>> shapefile to convert using ogr2ogr
>>>>>>>>> system("ogr2ogr -f GeoJSON test_from_shp.geojson test.shp")
>>>>>>>>> cat(readLines("test_from_shp.geojson"))
>>>>>>>>>
>>>>>>>>> # Output:
>>>>>>>>> { "type": "FeatureCollection",  "features": [ { "type": "Feature",
>>>>>>>>> "properties": { "FID": 0 }, "geometry": { "type": "MultiPolygon",
>>>>>>>>> "coordinates": [ [ [ [ 102.0, 2.0 ], [ 102.0, 3.0 ], [ 103.0, 3.0 ],
>>>>>>>>> [
>>>>>>>>> 103.0, 2.0 ], [ 102.0, 2.0 ] ] ], [ [ [ 100.0, 0.0 ], [ 100.0, 1.0
>>>>>>>>> ],
>>>>>>>>> [ 101.0, 1.0 ], [ 101.0, 0.0 ], [ 100.0, 0.0 ] ] ] ] } } ] }
>>>>>>>>>
>>>>>>>>> The resulting output is correct.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks in advance for any help on this.
>>>>>>>>>
>>>>>>>>> Andy
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> R-sig-Geo mailing list
>>>>>>>>> R-sig-Geo at r-project.org
>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Roger Bivand
>>>>>>>> Department of Economics, Norwegian School of Economics,
>>>>>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>>>>>> voice: +47 55 95 93 55; fax +47 55 95 91 00
>>>>>>>> e-mail: Roger.Bivand at nhh.no
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Roger Bivand
>>>> Department of Economics, Norwegian School of Economics,
>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>> voice: +47 55 95 93 55; fax +47 55 95 91 00
>>>> e-mail: Roger.Bivand at nhh.no
>>>
>>>
>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; fax +47 55 95 91 00
>> e-mail: Roger.Bivand at nhh.no
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no


More information about the R-sig-Geo mailing list