[R-sig-Geo] sp::disaggregate() does not seem to add the necessary "comment" attribute to identify holes if Polygons does not contain exactly 1 polygon with 0 or 1 holes

Anne C. Hanna orion at ofb.net
Mon May 1 20:42:37 CEST 2017


Roger,

Works for me now.  Thanks for your patience with this.

I'll definitely look at sf for future projects, but for now, I just want to
analyze my darn data!  :)

 - Anne


On 05/01/2017 05:23 AM, Roger Bivand wrote:
> On Sun, 30 Apr 2017, Anne C. Hanna wrote:
> 
>> Okay, I think I've tracked it down.  This is... absurdly complicated and I
>> don't know what the actual correct resolution is, but I'm pretty sure that at
>> least I know what's going wrong.
> 
> Good, and thanks for staying with this. I've pushed a fix to my github repo,
> and pull request #29 to edzer/sp.
> 
> Note that this also works out of the box:
> 
> library(sf)
> sf <- st_read(dsn = "2011 Voting District Boundary Shapefiles", layer =
> "MCDS", stringsAsFactors = FALSE)
> sfd <- st_cast(sf, "POLYGON")
> 
> However, using sf::st_touches() is a bit less intuitive for now than the spdep
> "nb" object generated by spdep::poly2nb(), which also provides Queen (touch at
> one or more boundary points) and Rook (touch at more than one boundary point),
> and sf::st_touches() is seven times slower than spdep::poly2nb() now.
> sf::st_touches() could be speeded up using an STR tree (spdep::poly2nb() can
> do this, but in any case only looks at candidate neighbours with intersecting
> bounding boxes). So:
> 
> sp_sfd <- as(sfd, "Spatial")
> library(spdep)
> sp_nb <- poly2nb(sp_sfd)
> 
> seems a reasonable workflow to find the adjacencies under Queen/Rook and
> snapping control.
> 
> Please try out the disaggregate fix and report back - and consider shifting to
> sf.
> 
> Roger
> 
>>
>> Basically it comes down to the fact that createSPComment() does not actually
>> check whether each individual polygon in the SpatialPolygons object has a
>> comment (much less a correct comment).  Instead, it looks at a top-level
>> "comment" attribute.  If that top-level comment attribute (i.e. sf at comment for
>> my SpatialPolygonsDataFrame object stored as "sf") is "TRUE", then
>> createSPComment() does nothing.
>>
>> And here's where the second layer of the problem comes in for my dataset ---
>> my original data, pre-disaggregation, has complete and correct comments.  Some
>> of those comments (for objects where it's a single polygon with 0 or 1 holes)
>> are being retained by the basic disaggregation algorithm, while others are
>> not.  Your patch attempts to fill in the holes by running createSPComment() on
>> the final results.  But... the thing you are running createSPComment() on is
>> SpatialPolygons(p), where p is the list of disaggregated polygons.  And the
>> SpatialPolygons() constructor looks at the list of the polygons you feed it,
>> and if *any* of those polygons have comments, it sets the top-level "comment"
>> attribute of its output to "TRUE", even if other polygons are missing
>> comments.  (And it also doesn't check the comments it gets for correctness.)
>> So the fact that some polygon comments are retained in p means that
>> SpatialPolygons() lies to createSPComment() about whether it has any missing
>> comments that need to be fixed.
>>
>> On the other hand, when I manually run createPolygonComment() on the
>> individual Polygons objects in disaggregate()'s output, I don't even look at
>> the top-level "comment" attribute, so it works just fine.
>>
>> My little toy test polygons, on the other hand, were just not complex enough
>> to end up with partial comments in them, so they also didn't experience this
>> error.
>>
>> I am not familiar enough with the design philosophy behind sp to suggest at
>> what level this issue should be fixed, but I hope this is at least enough info
>> to help those who do know what they're doing get it corrected.
>>
>> - Anne
>>
>>
>> On 04/30/2017 12:18 PM, Anne C. Hanna wrote:
>>> Roger,
>>>
>>> Unfortunately I have a use case for you of createSPComment() not working.
>>>
>>> I tried your new version of disaggregate() on the original test script I sent
>>> you, and it does seem to have fixed all the cases in there.  However, when I
>>> tried it on my actual data, it had the same failure mode as before. The
>>> original dataset has what appear to be correct comments, but the
>>> disaggregate() output is full of NULLs in place of comments for the multi-part
>>> and multi-hole polygons.
>>>
>>> createSPComment() appears to be what's failing, rather than just some logic in
>>> your disaggregate() function --- when I try to run createSPComment() on the
>>> disaggregate() output, I still get the same set of NULLs.  I can fix it if I
>>> run createPolygonsComment() individually on every Polygons object in my
>>> SpatialPolygonsDataFrame.
>>>
>>> I'm not sure what is different between my dataset and the test polygons I was
>>> using earlier, as createSPComment() seems to handle the test shapes just fine
>>> (and converting my SpatialPolygonsDataFrame into just a set of SpatialPolygons
>>> also doesn't help).  But presumably it is better to have createSPComment()
>>> fixed than to have to work around it in disaggregate().  So I guess I'll look
>>> at the code for that and see if I can find what's wrong.
>>>
>>> Just in case you want to see this in action, I've attached a little test
>>> script.  The data set I am working with is available at:
>>>
>>> http://aws.redistricting.state.pa.us/Redistricting/Resources/GISData/2011-Voting-District-Boundary-Shapefiles.zip
>>>
>>>
>>>  - Anne
>>>
>>>
>>> On 04/29/2017 10:06 AM, Roger Bivand wrote:
>>>> On Sat, 29 Apr 2017, Anne C. Hanna wrote:
>>>>
>>>>> Roger,
>>>>>
>>>>> This looks great, and I will try it out ASAP.  I do have one reservation
>>>>> though --- it seems you are using createSPComment() to reconstruct the
>>>>> comments, and I have seen some discussion that that may not be reliable in
>>>>> all cases (e.g. if the initial polygons are wonky in some way).  I don't
>>>>> know a lot about this, but is it possible that it would be preferable to
>>>>> parse and appropriately disaggregate the original comments strings (if they
>>>>> exist), so as to deal slightly more smoothly with such cases (i.e., the
>>>>> polygons would still be wonky, but at least the hole/polygon matching would
>>>>> track with whatever was in the original data)?
>>>>
>>>> Contributions very welcome - this was code moved from the raster package,
>>>> so I
>>>> really have little feeling for why it is necessary.
>>>>
>>>> Please provide references and use cases. GEOS is strict in its treatments of
>>>> geometries, but does work in most cases. People just expressing opinions
>>>> doesn't help, and may well be misleading. It is possible to clean geometries
>>>> too, see the cleangeo package. createPolygonsComment probably isn't
>>>> foolproof,
>>>> but specific reports help, because then it is possible to do something about
>>>> it. sp classes didn't use simple features when written because sf was not
>>>> widely used then - introduced in 2004, when sp was being developed. Using sf
>>>> helps, and rgeos::createPolygonsComment was a work-around from 2010 when
>>>> rgeos
>>>> was written.
>>>>
>>>> Even when you use sf in the sf package, you will still run into invalid
>>>> geometries because the geometries are in fact invalid, the sf package also
>>>> uses GEOS.
>>>>
>>>> Roger
>>>>
>>>>>
>>>>> I also had no success using createSPComment to fix disaggregate()'s output
>>>>> previously, even though my polygons are perfectly non-wonky, so I am perhaps
>>>>> a little more untrusting of it than I should be.  But I'll let you know how
>>>>> this version works with my data.  Thanks for addressing this so quickly!
>>>>>
>>>>> - Anne
>>>>>
>>>>>
>>>>> On 04/28/2017 12:11 PM, Roger Bivand wrote:
>>>>>> I've pushed the fix to my fork:
>>>>>>
>>>>>> https://github.com/rsbivand/sp
>>>>>>
>>>>>> and created a pull request:
>>>>>>
>>>>>> https://github.com/edzer/sp/pull/28
>>>>>>
>>>>>> Only one part of a complicated set if nested if() in disaggregate() was
>>>>>> adding
>>>>>> comments, but in some settings the existing comments survived the
>>>>>> disaggregation. Now the Polygons object comment attribute is re-created for
>>>>>> all Polygons objects. This is version 1.2-6, also including code changes
>>>>>> that
>>>>>> internally affect rgdal and rgeos - you may like to re-install them from
>>>>>> source after installing this sp version (shouldn't matter).
>>>>>>
>>>>>> Roger
>>>>>>
>>>>>> On Fri, 28 Apr 2017, Anne C. Hanna wrote:
>>>>>>
>>>>>>> Hello.  I first posted this issue report on the sp GitHub repo
>>>>>>> (https://github.com/edzer/sp/issues/27) and it was suggested that I
>>>>>>> redirect
>>>>>>> it to here.
>>>>>>>
>>>>>>> I am working with a geographic dataset with complex borders. The data is
>>>>>>> stored as a SpatialPolygonsDataFrame.  Each Polygons object in the data
>>>>>>> frame
>>>>>>> may be composed of multiple disjoint polygons, and each polygon may have
>>>>>>> multiple holes.  I want to disaggregate each of the Polygons objects
>>>>>>> into its
>>>>>>> individual disjoint polygons and construct an adjacency matrix for all the
>>>>>>> disjoint components, and I was using disaggregate() to do this.  However,
>>>>>>> when
>>>>>>> I run gTouches() on the disaggregated data, in order to compute the
>>>>>>> adjacencies, I get a number of warnings like this:
>>>>>>>
>>>>>>> Error in RGEOSBinPredFunc(spgeom1, spgeom2, byid, "rgeos_touches") :
>>>>>>> Polygons
>>>>>>> object missing comment attribute ignoring hole(s).  See function
>>>>>>> createSPComment.
>>>>>>>
>>>>>>> Looking at the Polygons "comment" attributes in the
>>>>>>> SpatialPolygonsDataFrame
>>>>>>> output by disaggregate(), I see that the only comment values are "0"
>>>>>>> (indicating a single polygon with no holes), "0 1" (indicating a single
>>>>>>> polygon with a single hole), and NULL (apparently no comment was written).
>>>>>>> Since I know my dataset contains several Polygons objects which are
>>>>>>> composed
>>>>>>> of multiple disjoint regions, and also several Polygons which contain more
>>>>>>> than one hole, this is not the expected result.  In reading the
>>>>>>> disaggregate()
>>>>>>> code in the sp GitHub repository (specifically, explodePolygons()), I also
>>>>>>> can't see anywhere the comment is being added for the cases where a
>>>>>>> Polygons
>>>>>>> object has more than two parts or more than two holes.  It actually seems
>>>>>>> like
>>>>>>> it's getting carried along almost accidentally in the few cases that do
>>>>>>> get
>>>>>>> comments, and neglected otherwise.
>>>>>>>
>>>>>>> Assuming I'm not failing to understand the code and the desired behavior
>>>>>>> (entirely possible, as I am new at working with this software!), this
>>>>>>> seems
>>>>>>> suboptimal to me.  My dataset is pretty well-behaved (despite its
>>>>>>> complexity),
>>>>>>> so I should be able to fix my issues with judicious application of
>>>>>>> createPolygonsComment.  But I had a heck of a time figuring out what was
>>>>>>> going
>>>>>>> wrong with gTouches, since Polygons comment management appears to be a
>>>>>>> pretty
>>>>>>> obscure field (and createSPComment wasn't working for me, for whatever
>>>>>>> reason).  So it seems like it might be better if disaggregate() just
>>>>>>> parses
>>>>>>> and passes along the comments from its input correctly, or, if it's
>>>>>>> absolutely
>>>>>>> necessary to not create comments, passes nothing and warns clearly in the
>>>>>>> manual that comments and associated hole information are being lost. 
>>>>>>> Passing
>>>>>>> along comments in some cases while silently dropping them in others seems
>>>>>>> like
>>>>>>> kind of the worst of both worlds.
>>>>>>>
>>>>>>> I've attached a set of tests I wrote to demonstrate desired/undesired
>>>>>>> behavior: disaggregate_comment_tests.R.  My R version is 3.4.0, my sp
>>>>>>> version
>>>>>>> is 1.2-4, my rgeos version is 0.3-23 (SVN revision 546), and my GEOS
>>>>>>> runtime
>>>>>>> version is 3.5.1-CAPI-1.9.1 r4246.  I am using Debian Release 9.0 with
>>>>>>> kernel
>>>>>>> version 4.9.0-2-amd64.  I hope this is useful; please let me know if
>>>>>>> you need
>>>>>>> more info or if there is a better place to post this issue.
>>>>>>>
>>>>>>> - Anne
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>
>>
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 195 bytes
Desc: OpenPGP digital signature
URL: <https://stat.ethz.ch/pipermail/r-sig-geo/attachments/20170501/fdf0ed47/attachment.sig>


More information about the R-sig-Geo mailing list