[R-sig-Geo] Cleaning up self-intersections

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Sat Dec 15 01:04:43 CET 2012


On Fri, Dec 14, 2012 at 8:16 PM, Lyndon Estes <lestes at princeton.edu> wrote:

> Given the type of work we are asking of people, we are getting many
> "unclean" polygons (self-intersects, overlaps, non-noding
> intersections, etc).  I have been writing in various patches to deal
> with these as I have been going, but I turn now for some advice on
> cleaning up operations. An example taken from the workflow illustrates
> a typical problem I encounter that I am looking for an efficient way
> to handle. This example is followed by my two questions.

> So, here's my first question.  Is there some other way of cleaning
> self-intersections up that preserves the area better than the examples
> using gBuffer and gSimplify, and which might be more efficient and
> provide a result that more closely approximates the original map than
> rasterizing and polygonizing? e.g. Is it possible to explode polygons
> at the point(s) of self-intersection, in order to create multiple
> valid polygons?
>
> My second question: am I barking up the wrong tree here?  Should I
> preferentially use something like GRASS's v.clean before I even read
> into R?  I haven't until now because 1) I don't know GRASS very well,
> 2) I want to minimize the number of different routines/software in the
> workflow (which currently involves python, postgis/postgres, R,
> openlayers).

 The most promising thing I've seen for cleaning up polygon topology
is pprepair:

https://github.com/tudelft-gist/pprepair


which won a best paper prize at OSGIS-UK this year. It's not simple to
run properly, which is why I say most promising, and not best.

If run on your extreme self-intersection 'bow tie' polygon, it
produces a shapefile with a feature for each part of the bow tie. I
can't currently find a way of tying those two features back to the
source bow-tie feature since the attributes aren't propagating
properly. Also, it seems to merge two of the large features in your
examples. It's a standalone C++ program that needs the CGAL library to
work, and there's no R interface. It operates on Shapefiles (or
possibly any OGR source?).
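
To make the "explode at the point(s) of self-intersection" idea
concrete, here is a rough pure-Python sketch. It is not how pprepair
works internally and it is not an R or rgeos call. It assumes a single
ring given as a list of (x, y) tuples without a repeated closing
vertex, and it only handles proper edge crossings, not touching or
overlapping edges. All the function names and the "source_id" key are
made up for illustration; the id is carried along so each piece can be
tied back to the feature it came from.

```python
def segment_crossing(p1, p2, p3, p4):
    """Return the point where segments p1-p2 and p3-p4 properly cross, else None."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None  # parallel or collinear edges are ignored in this sketch
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 < t < 1 and 0 < u < 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def split_once(ring):
    """Split a ring at its first edge crossing; return (ring_a, ring_b) or None."""
    n = len(ring)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edge share a vertex
            p = segment_crossing(ring[i], ring[i + 1], ring[j], ring[(j + 1) % n])
            if p is not None:
                ring_a = [p] + ring[i + 1:j + 1]
                ring_b = [p] + ring[j + 1:] + ring[:i + 1]
                return ring_a, ring_b
    return None


def explode(ring):
    """Keep splitting until no ring has a proper self-crossing left."""
    pending, simple = [list(ring)], []
    while pending:
        current = pending.pop()
        parts = split_once(current)
        if parts is None:
            simple.append(current)
        else:
            pending.extend(parts)
    return simple


def ring_area(ring):
    """Unsigned area of a ring by the shoelace formula."""
    total = 0.0
    for k in range(len(ring)):
        (xa, ya), (xb, yb) = ring[k], ring[(k + 1) % len(ring)]
        total += xa * yb - xb * ya
    return abs(total) / 2.0


def explode_feature(source_id, ring):
    """Explode one feature, tagging every piece with the id of its source."""
    return [{"source_id": source_id, "ring": r} for r in explode(ring)]


# A bow tie: the edges (0,0)-(2,2) and (2,0)-(0,2) cross at (1,1).
bow_tie = [(0, 0), (2, 2), (2, 0), (0, 2)]
pieces = explode_feature("bowtie-1", bow_tie)
for piece in pieces:
    print(piece["source_id"], piece["ring"], ring_area(piece["ring"]))
```

On the bow tie this gives two triangles of area 1.0 each, both still
labelled with the source id, whereas the shoelace formula on the
original ring gives 0 because the two lobes have opposite orientation.
Real digitized data will have touching edges, duplicate vertices and
holes that this does not deal with.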

Overall, cleaning up messy digitizing is a hard problem. Obviously
overlaps between polygons are wrong, and pprepair does a great job of
assigning them to one or other of the polygons, but gaps are trickier
to assign - they might be real gaps (like a river between regions) or
they could be digitizing errors. It's possible that some buffering
could help here, before doing pprepair.

Barry


