[R-sig-Geo] sp::disaggregate() does not seem to add the necessary "comment" attribute to identify holes if Polygons does not contain exactly 1 polygon with 0 or 1 holes

Roger Bivand Roger.Bivand at nhh.no
Mon May 1 11:23:07 CEST 2017


On Sun, 30 Apr 2017, Anne C. Hanna wrote:

> Okay, I think I've tracked it down.  This is... absurdly complicated and I
> don't know what the actual correct resolution is, but I'm pretty sure that at
> least I know what's going wrong.

Good, and thanks for staying with this. I've pushed a fix to my github 
repo, and pull request #29 to edzer/sp.

Note that this also works out of the box:

library(sf)
sf <- st_read(dsn = "2011 Voting District Boundary Shapefiles", layer = 
"MCDS", stringsAsFactors = FALSE)
sfd <- st_cast(sf, "POLYGON")

However, using sf::st_touches() is a bit less intuitive for now than the 
spdep "nb" object generated by spdep::poly2nb(), which also provides Queen 
(touch at one or more boundary points) and Rook (touch at more than one 
boundary point), and sf::st_touches() is seven times slower than 
spdep::poly2nb() now. sf::st_touches() could be speeded up using an STR 
tree (spdep::poly2nb() can do this, but in any case only looks at 
candidate neighbours with intersecting bounding boxes). So:

sp_sfd <- as(sfd, "Spatial")
library(spdep)
sp_nb <- poly2nb(sp_sfd)

seems a reasonable workflow to find the adjacencies under Queen/Rook and 
snapping control.

Please try out the disaggregate fix and report back - and consider 
shifting to sf.

Roger

>
> Basically it comes down to the fact that createSPComment() does not actually
> check whether each individual polygon in the SpatialPolygons object has a
> comment (much less a correct comment).  Instead, it looks at a top-level
> "comment" attribute.  If that top-level comment attribute (i.e. sf at comment for
> my SpatialPolygonsDataFrame object stored as "sf") is "TRUE", then
> createSPComment() does nothing.
>
> And here's where the second layer of the problem comes in for my dataset ---
> my original data, pre-disaggregation, has complete and correct comments.  Some
> of those comments (for objects where it's a single polygon with 0 or 1 holes)
> are being retained by the basic disaggregation algorithm, while others are
> not.  Your patch attempts to fill in the holes by running createSPComment() on
> the final results.  But... the thing you are running createSPComment() on is
> SpatialPolygons(p), where p is the list of disaggregated polygons.  And the
> SpatialPolygons() constructor looks at the list of the polygons you feed it,
> and if *any* of those polygons have comments, it sets the top-level "comment"
> attribute of its output to "TRUE", even if other polygons are missing
> comments.  (And it also doesn't check the comments it gets for correctness.)
> So the fact that some polygon comments are retained in p means that
> SpatialPolygons() lies to createSPComment() about whether it has any missing
> comments that need to be fixed.
>
> On the other hand, when I manually run createPolygonComment() on the
> individual Polygons objects in disaggregate()'s output, I don't even look at
> the top-level "comment" attribute, so it works just fine.
>
> My little toy test polygons, on the other hand, were just not complex enough
> to end up with partial comments in them, so they also didn't experience this
> error.
>
> I am not familiar enough with the design philosophy behind sp to suggest at
> what level this issue should be fixed, but I hope this is at least enough info
> to help those who do know what they're doing get it corrected.
>
> - Anne
>
>
> On 04/30/2017 12:18 PM, Anne C. Hanna wrote:
>> Roger,
>>
>> Unfortunately I have a use case for you of createSPComment() not working.
>>
>> I tried your new version of disaggregate() on the original test script I sent
>> you, and it does seem to have fixed all the cases in there.  However, when I
>> tried it on my actual data, it had the same failure mode as before. The
>> original dataset has what appear to be correct comments, but the
>> disaggregate() output is full of NULLs in place of comments for the multi-part
>> and multi-hole polygons.
>>
>> createSPComment() appears to be what's failing, rather than just some logic in
>> your disaggregate() function --- when I try to run createSPComment() on the
>> disaggregate() output, I still get the same set of NULLs.  I can fix it if I
>> run createPolygonsComment() individually on every Polygons object in my
>> SpatialPolygonsDataFrame.
>>
>> I'm not sure what is different between my dataset and the test polygons I was
>> using earlier, as createSPComment() seems to handle the test shapes just fine
>> (and converting my SpatialPolygonsDataFrame into just a set of SpatialPolygons
>> also doesn't help).  But presumably it is better to have createSPComment()
>> fixed than to have to work around it in disaggregate().  So I guess I'll look
>> at the code for that and see if I can find what's wrong.
>>
>> Just in case you want to see this in action, I've attached a little test
>> script.  The data set I am working with is available at:
>>
>> http://aws.redistricting.state.pa.us/Redistricting/Resources/GISData/2011-Voting-District-Boundary-Shapefiles.zip
>>
>>  - Anne
>>
>>
>> On 04/29/2017 10:06 AM, Roger Bivand wrote:
>>> On Sat, 29 Apr 2017, Anne C. Hanna wrote:
>>>
>>>> Roger,
>>>>
>>>> This looks great, and I will try it out ASAP.  I do have one reservation
>>>> though --- it seems you are using createSPComment() to reconstruct the
>>>> comments, and I have seen some discussion that that may not be reliable in
>>>> all cases (e.g. if the initial polygons are wonky in some way).  I don't
>>>> know a lot about this, but is it possible that it would be preferable to
>>>> parse and appropriately disaggregate the original comments strings (if they
>>>> exist), so as to deal slightly more smoothly with such cases (i.e., the
>>>> polygons would still be wonky, but at least the hole/polygon matching would
>>>> track with whatever was in the original data)?
>>>
>>> Contributions very welcome - this was code moved from the raster package, so I
>>> really have little feeling for why it is necessary.
>>>
>>> Please provide references and use cases. GEOS is strict in its treatments of
>>> geometries, but does work in most cases. People just expressing opinions
>>> doesn't help, and may well be misleading. It is possible to clean geometries
>>> too, see the cleangeo package. createPolygonsComment probably isn't foolproof,
>>> but specific reports help, because then it is possible to do something about
>>> it. sp classes didn't use simple features when written because sf was not
>>> widely used then - introduced in 2004, when sp was being developed. Using sf
>>> helps, and rgeos::createPolygonsComment was a work-around from 2010 when rgeos
>>> was written.
>>>
>>> Even when you use sf in the sf package, you will still run into invalid
>>> geometries because the geometries are in fact invalid, the sf package also
>>> uses GEOS.
>>>
>>> Roger
>>>
>>>>
>>>> I also had no success using createSPComment to fix disaggregate()'s output
>>>> previously, even though my polygons are perfectly non-wonky, so I am perhaps
>>>> a little more untrusting of it than I should be.  But I'll let you know how
>>>> this version works with my data.  Thanks for addressing this so quickly!
>>>>
>>>> - Anne
>>>>
>>>>
>>>> On 04/28/2017 12:11 PM, Roger Bivand wrote:
>>>>> I've pushed the fix to my fork:
>>>>>
>>>>> https://github.com/rsbivand/sp
>>>>>
>>>>> and created a pull request:
>>>>>
>>>>> https://github.com/edzer/sp/pull/28
>>>>>
>>>>> Only one part of a complicated set if nested if() in disaggregate() was adding
>>>>> comments, but in some settings the existing comments survived the
>>>>> disaggregation. Now the Polygons object comment attribute is re-created for
>>>>> all Polygons objects. This is version 1.2-6, also including code changes that
>>>>> internally affect rgdal and rgeos - you may like to re-install them from
>>>>> source after installing this sp version (shouldn't matter).
>>>>>
>>>>> Roger
>>>>>
>>>>> On Fri, 28 Apr 2017, Anne C. Hanna wrote:
>>>>>
>>>>>> Hello.  I first posted this issue report on the sp GitHub repo
>>>>>> (https://github.com/edzer/sp/issues/27) and it was suggested that I redirect
>>>>>> it to here.
>>>>>>
>>>>>> I am working with a geographic dataset with complex borders. The data is
>>>>>> stored as a SpatialPolygonsDataFrame.  Each Polygons object in the data frame
>>>>>> may be composed of multiple disjoint polygons, and each polygon may have
>>>>>> multiple holes.  I want to disaggregate each of the Polygons objects into its
>>>>>> individual disjoint polygons and construct an adjacency matrix for all the
>>>>>> disjoint components, and I was using disaggregate() to do this.  However,
>>>>>> when
>>>>>> I run gTouches() on the disaggregated data, in order to compute the
>>>>>> adjacencies, I get a number of warnings like this:
>>>>>>
>>>>>> Error in RGEOSBinPredFunc(spgeom1, spgeom2, byid, "rgeos_touches") : Polygons
>>>>>> object missing comment attribute ignoring hole(s).  See function
>>>>>> createSPComment.
>>>>>>
>>>>>> Looking at the Polygons "comment" attributes in the SpatialPolygonsDataFrame
>>>>>> output by disaggregate(), I see that the only comment values are "0"
>>>>>> (indicating a single polygon with no holes), "0 1" (indicating a single
>>>>>> polygon with a single hole), and NULL (apparently no comment was written).
>>>>>> Since I know my dataset contains several Polygons objects which are composed
>>>>>> of multiple disjoint regions, and also several Polygons which contain more
>>>>>> than one hole, this is not the expected result.  In reading the
>>>>>> disaggregate()
>>>>>> code in the sp GitHub repository (specifically, explodePolygons()), I also
>>>>>> can't see anywhere the comment is being added for the cases where a Polygons
>>>>>> object has more than two parts or more than two holes.  It actually seems
>>>>>> like
>>>>>> it's getting carried along almost accidentally in the few cases that do get
>>>>>> comments, and neglected otherwise.
>>>>>>
>>>>>> Assuming I'm not failing to understand the code and the desired behavior
>>>>>> (entirely possible, as I am new at working with this software!), this seems
>>>>>> suboptimal to me.  My dataset is pretty well-behaved (despite its
>>>>>> complexity),
>>>>>> so I should be able to fix my issues with judicious application of
>>>>>> createPolygonsComment.  But I had a heck of a time figuring out what was
>>>>>> going
>>>>>> wrong with gTouches, since Polygons comment management appears to be a pretty
>>>>>> obscure field (and createSPComment wasn't working for me, for whatever
>>>>>> reason).  So it seems like it might be better if disaggregate() just parses
>>>>>> and passes along the comments from its input correctly, or, if it's
>>>>>> absolutely
>>>>>> necessary to not create comments, passes nothing and warns clearly in the
>>>>>> manual that comments and associated hole information are being lost.  Passing
>>>>>> along comments in some cases while silently dropping them in others seems
>>>>>> like
>>>>>> kind of the worst of both worlds.
>>>>>>
>>>>>> I've attached a set of tests I wrote to demonstrate desired/undesired
>>>>>> behavior: disaggregate_comment_tests.R.  My R version is 3.4.0, my sp version
>>>>>> is 1.2-4, my rgeos version is 0.3-23 (SVN revision 546), and my GEOS runtime
>>>>>> version is 3.5.1-CAPI-1.9.1 r4246.  I am using Debian Release 9.0 with kernel
>>>>>> version 4.9.0-2-amd64.  I hope this is useful; please let me know if you need
>>>>>> more info or if there is a better place to post this issue.
>>>>>>
>>>>>> - Anne
>>>>>>
>>>>>
>>>>
>>>>
>>>
>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
http://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en



More information about the R-sig-Geo mailing list