[R-sig-Geo] Exporting Multipolygon geojson with rgdal::writeOGR

Forrest Stevens r-sig-geo at forreststevens.com
Mon Nov 2 23:52:15 CET 2015


This is fantastic work, thank you both for troubleshooting it and the fix
Roger. Your tireless work is greatly appreciated! I just ran across this
problem a couple of weeks ago when exporting from R to GeoJSON, and will
happily test out the new version.

Sincerely,
Forrest

On Mon, Nov 2, 2015 at 5:44 PM Andy Teucher <andy.teucher at gmail.com> wrote:

> Amazing (and fast) work Roger, and thanks for the great explanation.
> I can see why you wouldn't want to redo the way Polygons objects are
> structured - I'm sure it would be break a huge amount of existing
> code.
>
> Just curious, does OSX take longer because you have to deal with
> multiple possible GDAL versions?
>
> Thanks again,
> Andy
>
> On Mon, Nov 2, 2015 at 1:21 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
> > Hi Andy,
> >
> > On Mon, 26 Oct 2015, Andy Teucher wrote:
> >
> >> Hi Roger,
> >>
> >> Sorry I haven't been much help here - it's not for lack of trying,
> >> just lack of skills. I've been poring through OGR_write.cpp and trying
> >> to figure out language and the flow of things... a few thoughts
> >> occurred to me (which are probably obvious to you):
> >>
> >
> > Your example was very helpful, the problem was that when writeOGR() and
> the
> > underlying compiled code was written, the model used was that of the ESRI
> > Shapefile driver, rather than OGC SFS, which was first published in 2004.
> > Most older formats (and their drivers in OGR) use a OGR geometry factory
> > function to organize polygons on reading - but this is not present in
> > shapelib (bundled with the R maptools package), so when designing polygon
> > classes for sp, and writing rgdal::writeOGR() about 10 years ago, this
> was
> > not known, and was not revisited subsequently, until you happily spotted
> the
> > bug.
> >
> > GEOS in rgeos uses OGC SFS definitions (the ones you are using below),
> but
> > the SpatialPolygons/Polygons/Polygon structure was already established by
> > then, and a fix (using comment attributes in Polygons objects) was used,
> > rather than a major change to Polygons objects (in 2010).
> >
> > The revision to rgdal::writeOGR() now branches on the input objects - if
> > they have comments (applied in code, not user-facing), these are trusted,
> > and the MultiPolygon objects constructed containing one or more Polygon
> > objects with one exterior ring and zero or more interior rings on that
> > basis. However, if the input objects do not have comment attributes,
> > OGRGeometryFactory::organizePolygons() (C++) is used to create properly
> > organized MultiPolygon objects for export.
> >
> > This means that the pre-OGC SFS drivers continue to work, but that the
> OGC
> > SFS drivers, like GeoJSON, now also work correctly when the second ring
> in
> > an object is not assumed to be an interior ring (rgdal < 1.1-1). A test
> set
> > for all the possible cases has been introduced.
> >
> > The revision is now in source package form on CRAN:
> >
> > https://cran.r-project.org/web/packages/rgdal/index.html
> >
> > and a Windows binary package will be available later on Tuesday I hope.
> The
> > OSX revision may take a little longer, but we are grateful to have what
> we
> > have anyway.
> >
> > Thanks for a clear and very helpful bug report.
> >
> > Best wishes,
> >
> > Roger
> >
> >
> >> 1) Is this bug because holes are standalone polygons in sp objects,
> >> while they are members of polygon objects in geojson, wkt, etc?
> >>
> >> 2) On a related note, I notice that a single polygon with a hole is
> >> identified as a multipolygon containing a single polygon with a hole,
> >> when using writeOGR and one of the affected drivers.  Practically I
> >> don't think this matters too much, but it's not quite right.
> >>
> >> 3) In lines 116-160 the polygon/multipolygon and line/multiline
> >> determination is made for the entire layer. Can/should this be made
> >> for each feature in the layer, as you could have a layer with a
> >> mixture of single polygons and multipolygons?
> >>
> >> 4) I feel like there ought to be one more layer of looping through the
> >> polyons. e.g.:
> >>
> >> loop over each feature;
> >>     determine whether each feature is a single polygon (length 1 or
> >> all rings but one are holes), or a multipolygon:
> >>        if a polygon loop over and add ring(s);
> >>        if a multipolygon loop over polygons:
> >>            for each polygon add ring(s) and somehow assign holes to
> >> parent polygons
> >>
> >> Again, apologies if I'm being totally obvious (or way off base).
> >> Thanks for all your work on these packages, and on this issue.
> >>
> >> Andy
> >>
> >>
> >> On Thu, Oct 22, 2015 at 11:07 PM, Roger Bivand <Roger.Bivand at nhh.no>
> >> wrote:
> >>>
> >>> On Fri, 23 Oct 2015, Andy Teucher wrote:
> >>>
> >>>> On my Mac, I am running rgdal 1.0-7 and GDAL 1.11.3
> >>>>
> >>>>
> >>>> sessionInfo()
> >>>>
> >>>
> >>> OK, thanks.
> >>>
> >>> The problem also affects the SQLite driver (the CRAN binaries do not
> have
> >>> this driver). Could someone please check the PostGIS driver - my guess
> is
> >>> that all OGC SFS compliant drivers may be affected.
> >>>
> >>> Could somebody also please check whether this is a recently introduced
> >>> issue
> >>> or not? For example somebody still on rgdal < 1.0? src/OGR_write.cpp
> was
> >>> changed 2015-08-21, but the last previous change was 2015-06-11, then
> >>> 2015-05-31, 2014-08-17 ... from the ChangeLog visible on the package
> CRAN
> >>> page.
> >>>
> >>> Roger
> >>>
> >>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> R version 3.2.2 (2015-08-14)
> >>>>
> >>>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
> >>>>
> >>>> Running under: OS X 10.11 (El Capitan)
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> locale:
> >>>>
> >>>> [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> attached base packages:
> >>>>
> >>>> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> other attached packages:
> >>>>
> >>>> [1] rgdal_1.0-7 sp_1.2-1
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> loaded via a namespace (and not attached):
> >>>>
> >>>> [1] tools_3.2.2     grid_3.2.2      lattice_0.20-33
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> Andy
> >>>>
> >>>> On Thu, Oct 22, 2015 at 11:28 AM, Andy Teucher <
> andy.teucher at gmail.com>
> >>>> wrote:
> >>>>
> >>>>> One more test: WKT has the same issue:
> >>>>> library(rgdal)
> >>>>> js <- '{
> >>>>> "type": "MultiPolygon",
> >>>>> "coordinates": [[[[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0,
> >>>>> 3.0],
> >>>>> [102.0, 2.0]]], [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0,
> >>>>> 1.0],
> >>>>> [100.0, 0.0]]]]
> >>>>> } '
> >>>>> spdf <- readOGR(js, layer='OGRGeoJSON', verbose=FALSE)
> >>>>> writeOGR(spdf, dsn = "test.csv", layer = "test", driver="CSV",
> >>>>> layer_options = "GEOMETRY=AS_WKT")
> >>>>> cat(readLines("test.csv"), sep = "\n")
> >>>>> WKT,FID,
> >>>>> "MULTIPOLYGON (((102 2,102 3,103 3,103 2,102 2),(100 0,100 1,101
> 1,101
> >>>>> 0,100 0)))",0
> >>>>> It should be:
> >>>>> WKT,FID,
> >>>>> "MULTIPOLYGON (((102 2,102 3,103 3,103 2,102 2)),((100 0,100 1,101
> >>>>> 1,101 0,100 0)))",0
> >>>>> Andy
> >>>>> On Thu, Oct 22, 2015 at 10:36 AM, Andy Teucher <
> andy.teucher at gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>>
> >>>>>> Thanks very much Roger.
> >>>>>>
> >>>>>> I've replicated this on my work machine (Windows) and on my Mac at
> >>>>>> home.
> >>>>>>
> >>>>>> KML and GML are both affected. ESRI shapefile works fine and, though
> >>>>>> I'm not very familiar with the format, 'MapInfo File' works too
> >>>>>> (loading the file in qgis shows a multipart polygon with no geometry
> >>>>>> errors).
> >>>>>>
> >>>>>> sessionInfo() for my Windows machine (using rgdal 1.0-7 with
> included
> >>>>>> gdal drivers):
> >>>>>>
> >>>>>> R version 3.2.2 (2015-08-14)
> >>>>>> Platform: i386-w64-mingw32/i386 (32-bit)
> >>>>>> Running under: Windows 7 x64 (build 7601) Service Pack 1
> >>>>>>
> >>>>>> locale:
> >>>>>> [1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
> >>>>>> LC_MONETARY=English_Canada.1252
> >>>>>> [4] LC_NUMERIC=C                    LC_TIME=English_Canada.1252
> >>>>>>
> >>>>>> attached base packages:
> >>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>>>>>
> >>>>>> other attached packages:
> >>>>>> [1] rgdal_1.0-7 sp_1.2-0
> >>>>>>
> >>>>>> loaded via a namespace (and not attached):
> >>>>>> [1] tools_3.2.2     grid_3.2.2      lattice_0.20-33
> >>>>>>
> >>>>>>
> >>>>>> I'll send the details for my Mac environment this evening.
> >>>>>>
> >>>>>> I'll dig into the source code as soon as I can - C is a completely
> >>>>>> foreign language to me, so I can't promise anything fast!
> >>>>>>
> >>>>>> Thanks again,
> >>>>>> Andy
> >>>>>>
> >>>>>> On Thu, Oct 22, 2015 at 12:27 AM, Roger Bivand <Roger.Bivand at nhh.no
> >
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, 22 Oct 2015, Andy Teucher wrote:
> >>>>>>>
> >>>>>>>> I’m finding that writeOGR isn’t exporting multipolygons properly
> >>>>>>>> using
> >>>>>>>> the GeoJSON driver. I have a simple test case (borrowed from here:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> http://gis.stackexchange.com/questions/137977/writeogr-alters-multipolygon-holes
> )
> >>>>>>>> with a geojson string with one multipolygon containing two
> polygons.
> >>>>>>>> I
> >>>>>>>> use readOGR to create a SpatialPolygonsDataFrame out of it, then
> >>>>>>>> write
> >>>>>>>> it with writeOGR:
> >>>>>>>>
> >>>>>>>> library(rgdal)
> >>>>>>>>
> >>>>>>>> js <- '{
> >>>>>>>> "type": "MultiPolygon",
> >>>>>>>> "coordinates": [[[[102.0, 2.0], [103.0, 2.0], [103.0, 3.0],
> [102.0,
> >>>>>>>> 3.0],
> >>>>>>>> [102.0, 2.0]]], [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0],
> [100.0,
> >>>>>>>> 1.0],
> >>>>>>>> [100.0, 0.0]]]]
> >>>>>>>> } '
> >>>>>>>>
> >>>>>>>> spdf <- readOGR(js, layer='OGRGeoJSON', verbose=FALSE) # Create a
> >>>>>>>> SpatialPoygonsDataFrame
> >>>>>>>>
> >>>>>>>> temp <- tempfile()
> >>>>>>>> writeOGR(spdf, dsn = temp, layer = "", driver="GeoJSON")
> >>>>>>>> cat(readLines(temp))
> >>>>>>>>
> >>>>>>>> # Output:
> >>>>>>>> { "type": "FeatureCollection", "crs": { "type": "name",
> >>>>>>>> "properties":
> >>>>>>>> { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } }, "features": [ {
> >>>>>>>> "type":
> >>>>>>>> "Feature", "id": 0, "properties": { "FID": 0 }, "geometry": {
> >>>>>>>> "type":
> >>>>>>>> "MultiPolygon", "coordinates": [ [ [ [ 102.0, 2.0 ], [ 102.0, 3.0
> ],
> >>>>>>>> [
> >>>>>>>> 103.0, 3.0 ], [ 103.0, 2.0 ], [ 102.0, 2.0 ] ], [ [ 100.0, 0.0 ],
> [
> >>>>>>>> 100.0, 1.0 ], [ 101.0, 1.0 ], [ 101.0, 0.0 ], [ 100.0, 0.0 ] ] ]
> ] }
> >>>>>>>> }
> >>>>>>>> ] }
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> I can replicate this with GDAL 2.0.1 and rgdal 1.0-7 - reading temp
> >>>>>>> in
> >>>>>>> gives
> >>>>>>> the orphaned hole. I'm surprised that passing the js object to
> >>>>>>> readOGR()
> >>>>>>> doesn't fail, but that isn't the source of the problem. spdf
> appears
> >>>>>>> to
> >>>>>>> be
> >>>>>>> structured correctly. Please provide your GDAL and rgdal versions,
> >>>>>>> and
> >>>>>>> OS
> >>>>>>> details from sessionInfo().
> >>>>>>>
> >>>>>>> We are completely relying on the GDAL drivers here - we can't
> cherry
> >>>>>>> pick
> >>>>>>> for particular drivers (historically excepting ESRI Shapefile), so
> >>>>>>> your
> >>>>>>> debugging will need to examine writeOGR(), the C code it calls, and
> >>>>>>> the
> >>>>>>> interactions between the rgdal C code and called OGR functions.
> >>>>>>> Please
> >>>>>>> check
> >>>>>>> which other drivers are affected (I think KML is, is GML?). Can you
> >>>>>>> place
> >>>>>>> debugging Rprintf(...); in the rgdal C code to see where this is
> >>>>>>> coming
> >>>>>>> from?
> >>>>>>>
> >>>>>>> I can start looking at this sometime next week, so please make a
> >>>>>>> start
> >>>>>>> yourself; contributions from others are very welcome.
> >>>>>>>
> >>>>>>> Roger
> >>>>>>>
> >>>>>>>>
> >>>>>>>> If you look closely at the output, you can see that the
> >>>>>>>> 'coordinates'
> >>>>>>>> array now contains a single polygon array with two coordinate
> >>>>>>>> arrays:
> >>>>>>>> the boundary, and a second one which is now treated as a hole of
> the
> >>>>>>>> first (orphaned as it is outside the bounds of the polygon). The
> >>>>>>>> original 'coordinates' array consists of two polygon arrays, each
> >>>>>>>> consisting of a single coordinate array which defining a polygon
> >>>>>>>> (with
> >>>>>>>> no holes), which is correct according the GeoJSON spec:
> >>>>>>>> http://geojson.org/geojson-spec.html#polygon.
> >>>>>>>>
> >>>>>>>> I'm always hesitant to call things a bug, but this doesn't appear
> to
> >>>>>>>> happen using ogr2ogr on the command line:
> >>>>>>>>
> >>>>>>>> writeOGR(spdf, ".", "test", driver = "ESRI Shapefile") # Write a
> >>>>>>>> shapefile to convert using ogr2ogr
> >>>>>>>> system("ogr2ogr -f GeoJSON test_from_shp.geojson test.shp")
> >>>>>>>> cat(readLines("test_from_shp.geojson"))
> >>>>>>>>
> >>>>>>>> # Output:
> >>>>>>>> { "type": "FeatureCollection",  "features": [ { "type": "Feature",
> >>>>>>>> "properties": { "FID": 0 }, "geometry": { "type": "MultiPolygon",
> >>>>>>>> "coordinates": [ [ [ [ 102.0, 2.0 ], [ 102.0, 3.0 ], [ 103.0, 3.0
> ],
> >>>>>>>> [
> >>>>>>>> 103.0, 2.0 ], [ 102.0, 2.0 ] ] ], [ [ [ 100.0, 0.0 ], [ 100.0, 1.0
> >>>>>>>> ],
> >>>>>>>> [ 101.0, 1.0 ], [ 101.0, 0.0 ], [ 100.0, 0.0 ] ] ] ] } } ] }
> >>>>>>>>
> >>>>>>>> The resulting output is correct.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks in advance for any help on this.
> >>>>>>>>
> >>>>>>>> Andy
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> R-sig-Geo mailing list
> >>>>>>>> R-sig-Geo at r-project.org
> >>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Roger Bivand
> >>>>>>> Department of Economics, Norwegian School of Economics,
> >>>>>>> Helleveien 30, N-5045 Bergen, Norway.
> >>>>>>> voice: +47 55 95 93 55; fax +47 55 95 91 00
> >>>>>>> e-mail: Roger.Bivand at nhh.no
> >>>>
> >>>>
> >>>>
> >>>
> >>> --
> >>> Roger Bivand
> >>> Department of Economics, Norwegian School of Economics,
> >>> Helleveien 30, N-5045 Bergen, Norway.
> >>> voice: +47 55 95 93 55; fax +47 55 95 91 00
> >>> e-mail: Roger.Bivand at nhh.no
> >>
> >>
> >
> > --
> > Roger Bivand
> > Department of Economics, Norwegian School of Economics,
> > Helleveien 30, N-5045 Bergen, Norway.
> > voice: +47 55 95 93 55; fax +47 55 95 91 00
> > e-mail: Roger.Bivand at nhh.no
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list