[R-sig-Geo] rgdal::writeOGR with driver='ESRI Shapefile' converts Polygon object into a hole

Roger Bivand Roger@B|v@nd @end|ng |rom nhh@no
Sun Aug 18 14:59:10 CEST 2019


The reason that the problem occurred is that a MULTIPOLYGON with two 
exterior rings becomes invalid if the exterior rings touch along an edge 
(this case). It is important to know the use case, to see whether:

library(rgeos)
writeWKT(Ps1)
gIsValid(Ps1, reason=TRUE)
Ps1a <- gUnaryUnion(gBuffer(Ps1, width=0))
gIsValid(Ps1a, reason=TRUE)
writeWKT(Ps1a)

or equivalently in sf for sf objects, should be applied before trying to 
write the object out to file with this driver.

However, because drivers that are compliant with the simple features 
standard (which bans exterior rings sharing edges) have been permissive 
and do round-trip this invalid object, a relaxation in the OGR ESRI 
shapefile driver has been provided and may be included in a future 
release.

We need to know (see these issues):

https://github.com/r-spatial/sf/issues/1130
https://github.com/OSGeo/gdal/issues/1787

why it was desirable to write out this object using this driver? Could an 
alternative driver have been used, or is ESRI shapefile the only format 
used in the workflow?

If it has to be this driver, could the workflow be changed to repair 
degenerate cases before writing? If using sp classes, rgeos may be used to 
test for and probably repair such geometries before they reach 
rgdal::writeOGR() for this driver. Adding code to sf and rgdal to trap 
degenerate cases does encumber all users with valid geometries with 
the time wasted on extra checking, so building checks into sf and rgdal is 
not desirable.

I hope it is possible to find out more about the use case quickly, to pass 
on to GDAL developers to help motivate a relaxation in their current 
policy with regard to this driver, and to encourage them to include the 
fix branch in a future release of GDAL.

Roger

On Sat, 17 Aug 2019, Roger Bivand wrote:

> Please follow up both here and on:
>
> https://github.com/r-spatial/sf/issues/1130
>
> as the problem is also seen in the sf package using the same GDAL ESRI 
> Shapefile driver.
>
> Roger
>
> On Fri, 16 Aug 2019, Roger Bivand wrote:
>
>>  On Fri, 16 Aug 2019, Roger Bivand wrote:
>>
>>>   On Tue, 13 Aug 2019, Phil Haines wrote:
>>>
>>>>    Dear list,
>>>>
>>>>    I have a single Polygons object containing multiple Polygon objects
>>>>    that share a common border. When I output this using writeOGR() one of
>>>>    the Polygon objects becomes a hole, as the following example shows.
>>>>
>>>>    Create a Polygons object containing two adjoining Polygon objects
>>>>>    library(rgdal)
>>>>>    r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>>>    r2 <- r1; r2[,1] <- r2[,1]+1
>>>>>    Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>>>    SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>>>    data.frame(Example=c("Minimal")))
>>>>
>>>>    Perform a write/readOGR() cycle
>>>>>    fn <- tempfile()
>>>>>    writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>>>    SPDF2 <- readOGR(dsn=fn,layer='test')
>>>>
>>>>    Second Polygon object is now a hole
>>>>>    sapply(SPDF2 using polygons[[1]]@Polygons,slot,"hole")
>>>>    [1] FALSE  TRUE
>>>>
>>>>    I see from the sp documentation that "Polygon objects belonging to a
>>>>    Polygons object should either not overlap one-other, or should be
>>>>    fully included" but I am not sure how this relates to bordering
>>>>    Polygon objects. I would welcome any advice as to whether what I am
>>>>    asking of writeOGR is reasonable?
>>>
>>>   The problem is with the 'ESRI Shapefile' representation and driver:
>>>
>>>   library(rgdal)
>>>   r1 <- rbind(c(1,1),c(1,2),c(2,2),c(2,1),c(1,1))
>>>   r2 <- r1; r2[,1] <- r2[,1]+1
>>>   Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
>>>   SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)),
>>>   data.frame(Example=c("Minimal")))
>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>
>>>   # which constructs a MULTIPOLYGON object:
>>>
>>>   rgeos::writeWKT(SPDF)
>>>   library(sf)
>>>   st_as_text(st_geometry(st_as_sf(SPDF)))
>>> 
>>> #    The 'ESRI Shapefile' driver is not Simple-Feature compliant (it 
>>> #    pre-dates it), so the failure occurs by seeing the second exterior 
>>> #    ring
>>> #    as an interior ring
>>>
>>>   fn <- tempfile()
>>>   writeOGR(SPDF, fn, layer='test', driver='ESRI Shapefile')
>>>   SPDF2 <- readOGR(dsn=fn, layer='test')
>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF2, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>   rgeos::writeWKT(SPDF2)
>>>   st_as_text(st_geometry(st_as_sf(SPDF2)))
>>>
>>>   # This happens with sf too, using the same GDAL driver:
>>>
>>>   sf2 <- st_read(dsn=fn, layer='test')
>>>   st_geometry(sf2)
>>>   library(sf)
>>>   st_as_text(st_geometry(st_as_sf(sf2)))
>>>   rgeos::writeWKT(as(sf2, "Spatial"))
>>>
>>>   # Adding the comment fix doesn't help:
>>>
>>>   comment(slot(SPDF, "polygons")[[1]])
>>>   SPDF_c <- rgeos::createSPComment(SPDF)
>>>   comment(slot(SPDF_c, "polygons")[[1]])
>>>   writeOGR(SPDF_c, fn, layer='test_c', driver='ESRI Shapefile',
>>>     verbose=TRUE)
>>> 
>>> #    reports
>>> #    Object initially classed as: wkbPolygon
>>> #    SFS comments in Polygons objects
>>> #    Object reclassed as: wkbMultiPolygon
>>>
>>>   SPDF2_c <- readOGR(dsn=fn, layer='test_c')
>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF2_c, "polygons")[[1]], "Polygons"), slot,
>>>   "ringDir")
>>>   rgeos::writeWKT(SPDF2_c)
>>>   st_as_text(st_geometry(st_as_sf(SPDF2_c)))
>>>
>>> 
>>>
>>>   # If the input object is written out with the GeoPackage driver:
>>>
>>>   fn1 <- tempfile(fileext=".gpkg")
>>>   writeOGR(SPDF, fn1, layer="test", driver='GPKG')
>>>   sf2a <- st_read(dsn=fn1,layer='test')
>>>   st_coordinates(st_geometry(sf2a))
>>>   SPDF2a <- readOGR(dsn=fn1)
>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF2a, "polygons")[[1]], "Polygons"), slot, "ringDir")
>>>   rgeos::writeWKT(SPDF2a)
>>>   st_as_text(st_geometry(st_as_sf(SPDF2a)))
>>>
>>>   # the issue is resolved. If we separate the exterior rings:
>>>
>>>   r2a <- r1; r2a[,1] <- r2a[,1]+1.00001
>>>   Ps1a = Polygons(list(Polygon(r1),Polygon(r2a)),ID=1)
>>>   SPDFa = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1a)),
>>>   data.frame(Example=c("Minimal")))
>>>   fna <- tempfile()
>>>   writeOGR(SPDFa, fna, layer='test', driver='ESRI Shapefile')
>>>   SPDF2_a <- readOGR(dsn=fna, layer='test')
>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot, "hole")
>>>   sapply(slot(slot(SPDF2_a, "polygons")[[1]], "Polygons"), slot,
>>>   "ringDir")
>>>   rgeos::writeWKT(SPDF2_a)
>>>   st_as_text(st_geometry(st_as_sf(SPDF2_a)))
>>>
>>>   # we are OK as the two exterior rings do not touch.
>>>
>>>   # does using sf make a difference?
>>>
>>>   fn_s <- tempfile(fileext=".shp")
>>>   st_write(st_as_sf(SPDF), dsn=fn_s)
>>>   sf_in <- st_read(fn_s)
>>>   st_as_text(st_geometry(st_as_sf(sf_in)))
>>>
>>>   # No
>>>
>>>   fn_s <- tempfile(fileext=".shp")
>>>   st_write(st_as_sf(SPDF_c), dsn=fn_s)
>>>   sf_in_c <- st_read(fn_s)
>>>   st_as_text(st_geometry(st_as_sf(sf_in_c)))
>>>
>>>   # nor with the pretend-SF-compliant comment set either.
>>>
>>>   So the weakness is in the "ESRI Shapefile" write driver, or possibly in
>>>   the OGRGeometryFactory::organizePolygons() function in GDAL used in
>>>   OGR_write() (a C++ function) called by writeOGR(). If sf::st_write()
>>>   also
>>>   calls OGRGeometryFactory::organizePolygons(), we'd maybe consider that
>>>   it
>>>   has a weakness for the "ESRI Shapefile" driver, but which does not
>>>   affect
>>>   SF-compliant drivers.
>>
>>  Without the comment set, OGRGeometryFactory::organizePolygons() is used;
>>  with it set, OGRGeometryFactory::organizePolygons() is not used, because
>>  the object is declared as two exterior rings. In both cases, we have the
>>  output object written out and read back in incorrectly with the ESRI
>>  shapefile driver, but SF-compliant drivers round-trip (in the test
>>  GeoJSON), correctly.
>>
>>  It is likely that the changes made in 2015 to accommodate GeoJSON led to
>>  this possible regression for the ESRI Shapefile driver. I'm adding this
>>  geometry to tests/tripup.R and data files; without code changes the hole
>>  slot is wrong and the ring direction is changes to match, so the
>>  coordinates change too.
>>
>>  Reading using the deprecated maptools::readShapeSpatial() also gets a hole
>>  in the second external ring. However, writing with deprecated 
>> maptools::  writeSpatialShape() yields a shapefile that when read with 
>> maptools::  readShapeSpatial() gives the correct two exterior ring, no hole
>>  object. When read with sf::st_read() and rgdal::readOGR(), the object is
>>  also correct. So the problem definitely lies in rgdal::writeOGR(), and
>>  sf::st_write() - roundtripping with sf::st_write() and sf::st_read()
>>  degrades from MULTIPOLYGON to POLYGON with the ESRI Shapefile driver.
>>
>>  Roger
>> 
>>>
>>>   This is probably related to a similar but inverse problem with the
>>>   SF-compliant GeoJSON driver in 2015:
>>>
>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-October/023609.html
>>>
>>>   continued the next month in:
>>>
>>>   https://stat.ethz.ch/pipermail/r-sig-geo/2015-November/023656.html
>>>
>>>   The details are in this SVN diff
>>>
>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=555&r2=571
>>>
>>>   up to the end og the list thread, and this one from then until now:
>>>
>>>   https://r-forge.r-project.org/scm/viewvc.php/pkg/src/OGR_write.cpp?root=rgdal&r1=571&r2=733
>>>
>>>   Summary: could you change drivers, or is it really necessary to fix an
>>>   EOL
>>>   problem? What is your use case?
>>> 
>>>>
>>>>    Thanks in advance for your time,
>>>
>>>   Thanks for a complete example,
>>>
>>>   Roger
>>>
>>>>    Phil
>>>>
>>>>>    sessionInfo()
>>>>    R version 3.5.1 (2018-07-02)
>>>>    Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>>    Running under: Windows >= 8 x64 (build 9200)
>>>>
>>>>    Matrix products: default
>>>>
>>>>    locale:
>>>>    [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
>>>>    Kingdom.1252    LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
>>>>                            LC_TIME=English_United Kingdom.1252
>>>>
>>>>    attached base packages:
>>>>    [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>>    other attached packages:
>>>>    [1] rgdal_1.4-4 sp_1.3-1
>>>>
>>>>    loaded via a namespace (and not attached):
>>>>    [1] compiler_3.5.1  tools_3.5.1     yaml_2.2.0      grid_3.5.1
>>>>    lattice_0.20-35
>>>>
>>>>    _______________________________________________
>>>>    R-sig-Geo mailing list
>>>>    R-sig-Geo using r-project.org
>>>>    https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>
>>> 
>> 
>> 
>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand using nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en



More information about the R-sig-Geo mailing list