[R-sig-Geo] gUnaryUnion Not Dissolving Correctly

Roger Bivand Roger.Bivand at nhh.no
Thu Nov 5 07:45:20 CET 2015


On Wed, 4 Nov 2015, Matt Strimas-Mackey wrote:

> The original study area shapefile is a boundary of the Indonesia half
> of New Guinea. The file as well as the code to construct the hexagonal
> grids are here:
> https://www.dropbox.com/sh/ff8v08p3ambqcbs/AAAPBlGP4fthdmZhrto7oIuCa?dl=0
>
> Since it's a large area, generating the grid takes a long time, so
> I've also included code for a small subset of the original
> shapefile--one small offshore island.

An equivalent scaling fix is to change the coordinate units from metres to 
kilometres:

library(rgdal)
study_area2 <- spTransform(study_area, CRS("+proj=aea +lat_1=-7.42
  +lat_2=-0.62 +lat_0=-9.119 +lon_0=128.91 +x_0=0 y_0=0 +datum=WGS84
  +units=km +no_defs +ellps=WGS84 +towgs84=0,0,0"))
size <- sqrt(2 * 1e2 / sqrt(3)) # reduce to suit
set.seed(1)
hex_points <- spsample(study_area2[2,], type="hexagonal", cellsize=size)
hex_grid <- HexPoints2SpatialPolygons(hex_points)
hex_union <- gUnaryUnion(hex_grid)
plot(hex_grid, col='lightgrey')
plot(hex_union, border='orange', lwd=3, add=T)

An advantage of these scaling issues being made visible is that we see 
that computers really do not operate in continuous space, and that 
computational geometry actually matters, I suppose.

I'll check the underlying code generating the hexagons to see why the 
nodes (where hexagon boundaries meet) appear to slide apart at the edge of 
machine precision.

Roger


>
> Finally, some more odd behaviour. I noticed each time I run this code,
> the dissolve mistakes change, i.e. different boundaries are
> erroneously kept. However, using set.seed() makes the errors the same
> each time for a given seed, and changing the seed yields a different
> set of errors. Example in the code in the dropbox link and copied
> here:
>
> library(sp)
> library(raster)
> library(rgeos)
>
> # just a subset of full shapefile
> set.seed(1)
> size <- sqrt(2 * 1e8 / sqrt(3))
> study_area <- shapefile('papua.shp')
> hex_points <- spsample(study_area[2,], type="hexagonal", cellsize=size)
> hex_grid <- HexPoints2SpatialPolygons(hex_points)
> hex_union <- gUnaryUnion(hex_grid)
> plot(hex_grid, col='lightgrey')
> plot(hex_union, border='orange', lwd=3, add=T)
>
> # results chage according to seed
> set.seed(100)
> size <- sqrt(2 * 1e8 / sqrt(3))
> study_area <- shapefile('papua.shp')
> hex_points <- spsample(study_area[2,], type="hexagonal", cellsize=size)
> hex_grid <- HexPoints2SpatialPolygons(hex_points)
> hex_union <- gUnaryUnion(hex_grid)
> plot(hex_grid, col='lightgrey')
> plot(hex_union, border='orange', lwd=3, add=T)
>
> On Wed, Nov 4, 2015 at 8:57 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>> On Wed, 4 Nov 2015, Matt Strimas-Mackey wrote:
>>
>>> Thanks, lots of useful info here. I've never seen the setScale()
>>> function; I don't think it's mentioned in the gUnaryUnion help. This
>>> saves me a lot of headache!
>>>
>>> For what it's worth, the invalid geometry is an artifact of the
>>> reproducible example I created. The original hexagonal grid is
>>> produced with
>>> g <- spsample(study_area, type="hexagonal", cellsize=size)
>>> hex_grid <- HexPoints2SpatialPolygons(g)
>>
>>
>> OK, thanks - this is useful. Could you please make available the study area
>> object in some way - so that we can re-create g and see how
>> HexPoints2SpatialPolygons() creates the artefact Mike spotted (although this
>> looks like numeric fuzz - 'changing "1287248.96712942" to
>> "1287248.96712943"' is a change in the 15th digit, which is on the precision
>> edge of the "double" storage mode. If we can revisit functions creating
>> SpatialPolygons objects to ensure that they are GEOS-compatible for the
>> default scale of 1e+8, we'll be more secure.
>>
>> Roger
>>
>>
>>>
>>> And this object passes gIsValid() and clgeo_GeometryReport() without
>>> any problems, yet still has the dissolving issue. Regardless, all is
>>> solved with setScale().
>>>
>>> Thanks!
>>>
>>> M
>>>
>>> On Wed, Nov 4, 2015 at 6:30 AM, Michael Sumner <mdsumner at gmail.com> wrote:
>>>>
>>>>
>>>>
>>>> On Thu, 5 Nov 2015 at 01:10 Roger Bivand <Roger.Bivand at nhh.no> wrote:
>>>>>
>>>>>
>>>>> On Wed, 4 Nov 2015, Michael Sumner wrote:
>>>>>
>>>>>> Thanks for all this detail Roger, is there a way to "re-build" a
>>>>>> spatial
>>>>>> object so that the given scale setting is applied? Are there any
>>>>>> general
>>>>>> rounding or "orthogonalize" functions in the Spatial suite?
>>>>>
>>>>>
>>>>> No, not really. In this case, the very detailed coordinate measurements
>>>>> may have made things worse, or possibly using Polygon rather than
>>>>> Polygons
>>>>> objects, or not building the Polygons object with a proper comment
>>>>> attribute, I don't know. rgeos::gIsValid(spoly) is FALSE, and
>>>>>
>>>>>
>>>>> cleangeo::clgeo_GeometryReport(clgeo_CollectionReport(clgeo_Clean(spoly)))
>>>>>
>>>>> says the same (based on rgeos). I suggest working with Emmanuel Blondel
>>>>> (cleangeo maintainer) to extend cleangeo. GEOS is looking at allowing
>>>>> users to manipulate precision models, not just scale, but I'm uncertain
>>>>> about that.
>>>>>
>>>>> Running spoly into GRASS and back out (GRASS builds topology on import)
>>>>> shows a different error, the object seems to be problematic.
>>>>>
>>>>
>>>> It can be fixed by changing "1287248.96712942" to "1287248.96712943", so
>>>> really the creator should not have had those values on input.  There's no
>>>> easy way out without topology.
>>>>
>>>> This brought out some interesting issues for work I've been doing using
>>>> the
>>>> "near-Delaunay triangulation" in RTriangle, and that requires a
>>>> normalized
>>>> set of vertices (no duplicate vertices) which on its own presents
>>>> interesting problems. I have a related issue where a parallel (latitude)
>>>> line needs many vertices to look "smooth" on a polar projection, but when
>>>> building a mesh with triangles it's really best to allow relatively
>>>> coarse
>>>> segmented boundaries rather than have many elements at parallels. The
>>>> Triangle library does not consider these hexagon coordinates to be
>>>> duplicates, so there are two vertical segments between the two bottom
>>>> polys
>>>> at
>>>>
>>>>  points(coordinates(as(as(spoly, "SpatialLines"), "SpatialPoints"))[c(3,
>>>> 4),
>>>> ])
>>>>
>>>> Thanks for the reminder about cleangeo, I'll have closer look.
>>>>
>>>> cheers, Mike.
>>>>
>>>>>
>>>>> Roger
>>>>>
>>>>>>
>>>>>> Cheers, Mike.
>>>>>>
>>>>>> On Wed, 4 Nov 2015 at 18:16 Roger Bivand <Roger.Bivand at nhh.no> wrote:
>>>>>>
>>>>>>> On Tue, 3 Nov 2015, Matt Strimas-Mackey wrote:
>>>>>>>
>>>>>>>> I'm working with a regular hexagonal grid stored as SPDF. At some
>>>>>>>> point I subset this SPDF, then want to combine all adjacent hexagons
>>>>>>>> together so that each contiguous set of hexagons is a single polygon.
>>>>>>>> I'm doing this last step using gUnaryUnion (or gUnionCascaded, not
>>>>>>>> clear what the different is actually). The problem is that in some
>>>>>>>> cases boundaries between clearly adjacent polygons are not dissolved.
>>>>>>>>
>>>>>>>> Example:
>>>>>>>>
>>>>>>>> ## Create three adjacent hexagons
>>>>>>>> library(sp)
>>>>>>>> library(rgeos)
>>>>>>>> p1 <- Polygon(cbind(
>>>>>>>>      c(1276503.26781119, 1281876.11747031, 1287248.96712942,
>>>>>>>>        1287248.96712942, 1281876.11747031, 1276503.26781119,
>>>>>>>
>>>>>>> 1276503.26781119),
>>>>>>>>
>>>>>>>>      c(204391.40834643, 207493.42454344, 204391.40834643,
>>>>>>>
>>>>>>> 198187.37595242,
>>>>>>>>
>>>>>>>>        195085.35975541, 198187.37595242, 204391.40834643)))
>>>>>>>> p2 <- Polygon(cbind(
>>>>>>>>      c(1287248.96712943, 1292621.81678854, 1297994.66644766,
>>>>>>>>        1297994.66644766, 1292621.81678854, 1287248.96712943,
>>>>>>>
>>>>>>> 1287248.96712943),
>>>>>>>>
>>>>>>>>      c(204391.40834643, 207493.42454344, 204391.40834643,
>>>>>>>
>>>>>>> 198187.37595242,
>>>>>>>>
>>>>>>>>        195085.35975541, 198187.37595242, 204391.40834643)))
>>>>>>>> p3 <- Polygon(cbind(
>>>>>>>>      c(1281876.11747031, 1287248.96712943, 1292621.81678854,
>>>>>>>>        1292621.81678854, 1287248.96712943, 1281876.11747031,
>>>>>>>
>>>>>>> 1281876.11747031),
>>>>>>>>
>>>>>>>>      c(213697.45693745, 216799.47313446, 213697.45693745,
>>>>>>>
>>>>>>> 207493.42454344,
>>>>>>>>
>>>>>>>>        204391.40834643, 207493.42454344, 213697.45693745)))
>>>>>>>> spoly <- SpatialPolygons(list(Polygons(list(p1, p2, p3), 's1')))
>>>>>>>> plot(gUnaryUnion(spoly))
>>>>>>>
>>>>>>>
>>>>>>> No, this is just numerical fuzz:
>>>>>>>
>>>>>>> plot(spoly)
>>>>>>> plot(gUnaryUnion(spoly), border="green", lty=3, lwd=2, add=TRUE)
>>>>>>> oS <- getScale()
>>>>>>> # default 1e+8
>>>>>>> setScale(1e+4)
>>>>>>> plot(gUnaryUnion(spoly), border="orange", lwd=2, add=TRUE)
>>>>>>> setScale(oS)
>>>>>>>
>>>>>>> JTS, GEOS, and consequently rgeos by default shift all coordinates to
>>>>>>> an
>>>>>>> integer grid after multiplying by a scale factor (finding integer
>>>>>>> matches
>>>>>>> is much easier than real matches). If the scaling is too detailed (in
>>>>>>> some
>>>>>>> cases), the operations do not give the expected outcomes.
>>>>>>>
>>>>>>> There is work in progress in GEOS and JTS to provide other scaling
>>>>>>> options
>>>>>>> and models, and to permit iteration over scaling values until a
>>>>>>> "clean"
>>>>>>> result is obtained (for some meanings of clean).
>>>>>>>
>>>>>>> gUnionCascaded() was the only possible function for GEOS < 3.3.0, from
>>>>>>> GEOS 3.3.0 gUnaryUnion() is available and the prefered and more
>>>>>>> efficient
>>>>>>> route. This is explained on the help page.
>>>>>>>
>>>>>>> Hope this clarifies,
>>>>>>>
>>>>>>> Roger
>>>>>>>
>>>>>>>>
>>>>>>>> Note that p2 and p3 are dissolved together, but p1 is separate. The
>>>>>>>> shared edge of p1 and p2 is:
>>>>>>>> p1:
>>>>>>>> [2,] 1281876 207493.4
>>>>>>>> [3,] 1287249 204391.4
>>>>>>>> p2:
>>>>>>>> [5,] 1287249 204391.4
>>>>>>>> [6,] 1281876 207493.4
>>>>>>>>
>>>>>>>> So, exactly the same apart from the order. I originally thought this
>>>>>>>> difference in order might be the problem, but this doesn't seem to be
>>>>>>>> an issue with in this example, where order is also flipped:
>>>>>>>> sss <- rasterToPolygons(raster(nrow=2, ncol=2, xmn=0, xmx=2, ymn=0,
>>>>>>>> ymx=2, vals=1:4))
>>>>>>>> lapply(sss at polygons, function(x) slot(x, 'Polygons')[[1]]@coords)
>>>>>>>> plot(sss)
>>>>>>>> plot(gUnaryUnion(sss))
>>>>>>>>
>>>>>>>> Session Info:
>>>>>>>> R version 3.2.2 (2015-08-14)
>>>>>>>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>>>>>>>> Running under: OS X 10.10.5 (Yosemite)
>>>>>>>>
>>>>>>>> Message when rgeos is loaded:
>>>>>>>>
>>>>>>>> rgeos version: 0.3-14, (SVN revision 511)
>>>>>>>> GEOS runtime version: 3.5.0-CAPI-1.9.0 r0
>>>>>>>> Linking to sp version: 1.2-0
>>>>>>>> Polygon checking: TRUE
>>>>>>>>
>>>>>>>> Any help on how to get these polygons to dissolve is appreciated.
>>>>>>>>
>>>>>>>> M
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> R-sig-Geo mailing list
>>>>>>>> R-sig-Geo at r-project.org
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Roger Bivand
>>>>>>> Department of Economics, Norwegian School of Economics,
>>>>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>>>>> voice: +47 55 95 93 55; fax +47 55 95 91 00
>>>>>>> e-mail: Roger.Bivand at nhh.no
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-Geo mailing list
>>>>>>> R-sig-Geo at r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Roger Bivand
>>>>> Department of Economics, Norwegian School of Economics,
>>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>>> voice: +47 55 95 93 55; fax +47 55 95 91 00
>>>>> e-mail: Roger.Bivand at nhh.no
>>>>>
>>>>
>>>
>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; fax +47 55 95 91 00
>> e-mail: Roger.Bivand at nhh.no
>>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no



More information about the R-sig-Geo mailing list