[R-sig-Geo] gUnaryUnion Not Dissolving Correctly

Roger Bivand Roger.Bivand at nhh.no
Thu Nov 5 11:49:24 CET 2015


On Thu, 5 Nov 2015, Roger Bivand wrote:

> On Wed, 4 Nov 2015, Matt Strimas-Mackey wrote:
>
>>  The original study area shapefile is a boundary of the Indonesia half
>>  of New Guinea. The file as well as the code to construct the hexagonal
>>  grids are here:
>>  https://www.dropbox.com/sh/ff8v08p3ambqcbs/AAAPBlGP4fthdmZhrto7oIuCa?dl=0
>>
>>  Since it's a large area, generating the grid takes a long time, so
>>  I've also included code for a small subset of the original
>>  shapefile--one small offshore island.
>
> An equivalent scaling fix is to change the coordinate units from metres to 
> kilometres:
>
> library(rgdal)
> study_area2 <- spTransform(study_area, CRS("+proj=aea +lat_1=-7.42
>  +lat_2=-0.62 +lat_0=-9.119 +lon_0=128.91 +x_0=0 y_0=0 +datum=WGS84
>  +units=km +no_defs +ellps=WGS84 +towgs84=0,0,0"))
> size <- sqrt(2 * 1e2 / sqrt(3)) # reduce to suit
> set.seed(1)
> hex_points <- spsample(study_area2[2,], type="hexagonal", cellsize=size)
> hex_grid <- HexPoints2SpatialPolygons(hex_points)
> hex_union <- gUnaryUnion(hex_grid)
> plot(hex_grid, col='lightgrey')
> plot(hex_union, border='orange', lwd=3, add=T)
>
> An advantage of these scaling issues being made visible is that we see that 
> computers really do not operate in continuous space, and that computational 
> geometry actually matters, I suppose.
>
> I'll check the underlying code generating the hexagons to see why the nodes 
> (where hexagon boundaries meet) appear to slide apart at the edge of machine 
> precision.

A potentially robust approach at default scale uses the dx= argument to 
HexPoints2SpatialPolygons():

hex_grid <- HexPoints2SpatialPolygons(hex_points, dx=size)

Setting:

set_RGEOS_polyThreshold(1e-2) # for example
set_RGEOS_warnSlivers(TRUE)

shows the remaining slivers, and:

set_RGEOS_dropSlivers(TRUE)

drops them. It is still possible that an inward dangle will get through 
(zero area line in from boundary point, but part of the single boundary).

Setting dx= to the same value as cellsize= in the spsample call seems to 
be crucial, avoiding an approximation marked in the code as:

 	# EJP; changed:
 	# how to figure out dx from a grid? THK suggested:
         #dx <- hexGrid$x[2] - hexGrid$x[1]
 	# and the following will also not always work:

in sp:::genPolyList(), so there was a warning there against taking the 
default.

Roger

>
> Roger
>
>
>>
>>  Finally, some more odd behaviour. I noticed each time I run this code,
>>  the dissolve mistakes change, i.e. different boundaries are
>>  erroneously kept. However, using set.seed() makes the errors the same
>>  each time for a given seed, and changing the seed yields a different
>>  set of errors. Example in the code in the dropbox link and copied
>>  here:
>>
>>  library(sp)
>>  library(raster)
>>  library(rgeos)
>>
>>  # just a subset of full shapefile
>>  set.seed(1)
>>  size <- sqrt(2 * 1e8 / sqrt(3))
>>  study_area <- shapefile('papua.shp')
>>  hex_points <- spsample(study_area[2,], type="hexagonal", cellsize=size)
>>  hex_grid <- HexPoints2SpatialPolygons(hex_points)
>>  hex_union <- gUnaryUnion(hex_grid)
>>  plot(hex_grid, col='lightgrey')
>>  plot(hex_union, border='orange', lwd=3, add=T)
>>
>>  # results chage according to seed
>>  set.seed(100)
>>  size <- sqrt(2 * 1e8 / sqrt(3))
>>  study_area <- shapefile('papua.shp')
>>  hex_points <- spsample(study_area[2,], type="hexagonal", cellsize=size)
>>  hex_grid <- HexPoints2SpatialPolygons(hex_points)
>>  hex_union <- gUnaryUnion(hex_grid)
>>  plot(hex_grid, col='lightgrey')
>>  plot(hex_union, border='orange', lwd=3, add=T)
>>
>>  On Wed, Nov 4, 2015 at 8:57 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>> >  On Wed, 4 Nov 2015, Matt Strimas-Mackey wrote:
>> > 
>> > >  Thanks, lots of useful info here. I've never seen the setScale()
>> > >  function; I don't think it's mentioned in the gUnaryUnion help. This
>> > >  saves me a lot of headache!
>> > > 
>> > >  For what it's worth, the invalid geometry is an artifact of the
>> > >  reproducible example I created. The original hexagonal grid is
>> > >  produced with
>> > >  g <- spsample(study_area, type="hexagonal", cellsize=size)
>> > >  hex_grid <- HexPoints2SpatialPolygons(g)
>> > 
>> > 
>> >  OK, thanks - this is useful. Could you please make available the study 
>> >  area
>> >  object in some way - so that we can re-create g and see how
>> >  HexPoints2SpatialPolygons() creates the artefact Mike spotted (although 
>> >  this
>> >  looks like numeric fuzz - 'changing "1287248.96712942" to
>> >  "1287248.96712943"' is a change in the 15th digit, which is on the 
>> >  precision
>> >  edge of the "double" storage mode. If we can revisit functions creating
>> >  SpatialPolygons objects to ensure that they are GEOS-compatible for the
>> >  default scale of 1e+8, we'll be more secure.
>> > 
>> >  Roger
>> > 
>> > 
>> > > 
>> > >  And this object passes gIsValid() and clgeo_GeometryReport() without
>> > >  any problems, yet still has the dissolving issue. Regardless, all is
>> > >  solved with setScale().
>> > > 
>> > >  Thanks!
>> > > 
>> > >  M
>> > > 
>> > >  On Wed, Nov 4, 2015 at 6:30 AM, Michael Sumner <mdsumner at gmail.com> 
>> > >  wrote:
>> > > > 
>> > > > 
>> > > > 
>> > > >  On Thu, 5 Nov 2015 at 01:10 Roger Bivand <Roger.Bivand at nhh.no> 
>> > > >  wrote:
>> > > > > 
>> > > > > 
>> > > > >  On Wed, 4 Nov 2015, Michael Sumner wrote:
>> > > > > 
>> > > > > >  Thanks for all this detail Roger, is there a way to "re-build" a
>> > > > > >  spatial
>> > > > > >  object so that the given scale setting is applied? Are there any
>> > > > > >  general
>> > > > > >  rounding or "orthogonalize" functions in the Spatial suite?
>> > > > > 
>> > > > > 
>> > > > >  No, not really. In this case, the very detailed coordinate 
>> > > > >  measurements
>> > > > >  may have made things worse, or possibly using Polygon rather than
>> > > > >  Polygons
>> > > > >  objects, or not building the Polygons object with a proper comment
>> > > > >  attribute, I don't know. rgeos::gIsValid(spoly) is FALSE, and
>> > > > > 
>> > > > > 
>> > > > >  cleangeo::clgeo_GeometryReport(clgeo_CollectionReport(clgeo_Clean(spoly)))
>> > > > > 
>> > > > >  says the same (based on rgeos). I suggest working with Emmanuel 
>> > > > >  Blondel
>> > > > >  (cleangeo maintainer) to extend cleangeo. GEOS is looking at 
>> > > > >  allowing
>> > > > >  users to manipulate precision models, not just scale, but I'm 
>> > > > >  uncertain
>> > > > >  about that.
>> > > > > 
>> > > > >  Running spoly into GRASS and back out (GRASS builds topology on 
>> > > > >  import)
>> > > > >  shows a different error, the object seems to be problematic.
>> > > > > 
>> > > > 
>> > > >  It can be fixed by changing "1287248.96712942" to 
>> > > >  "1287248.96712943", so
>> > > >  really the creator should not have had those values on input. 
>> > > >  There's no
>> > > >  easy way out without topology.
>> > > > 
>> > > >  This brought out some interesting issues for work I've been doing 
>> > > >  using
>> > > >  the
>> > > >  "near-Delaunay triangulation" in RTriangle, and that requires a
>> > > >  normalized
>> > > >  set of vertices (no duplicate vertices) which on its own presents
>> > > >  interesting problems. I have a related issue where a parallel 
>> > > >  (latitude)
>> > > >  line needs many vertices to look "smooth" on a polar projection, but 
>> > > >  when
>> > > >  building a mesh with triangles it's really best to allow relatively
>> > > >  coarse
>> > > >  segmented boundaries rather than have many elements at parallels. 
>> > > >  The
>> > > >  Triangle library does not consider these hexagon coordinates to be
>> > > >  duplicates, so there are two vertical segments between the two 
>> > > >  bottom
>> > > >  polys
>> > > >  at
>> > > > 
>> > > >  points(coordinates(as(as(spoly, "SpatialLines"), 
>> > > >  "SpatialPoints"))[c(3,
>> > > >  4),
>> > > > ] ) 
>> > > > 
>> > > >  Thanks for the reminder about cleangeo, I'll have closer look.
>> > > > 
>> > > >  cheers, Mike.
>> > > > 
>> > > > > 
>> > > > >  Roger
>> > > > > 
>> > > > > > 
>> > > > > >  Cheers, Mike.
>> > > > > > 
>> > > > > >  On Wed, 4 Nov 2015 at 18:16 Roger Bivand <Roger.Bivand at nhh.no> 
>> > > > > >  wrote:
>> > > > > > 
>> > > > > > >  On Tue, 3 Nov 2015, Matt Strimas-Mackey wrote:
>> > > > > > > 
>> > > > > > > >  I'm working with a regular hexagonal grid stored as SPDF. At 
>> > > > > > > >  some
>> > > > > > > >  point I subset this SPDF, then want to combine all adjacent 
>> > > > > > > >  hexagons
>> > > > > > > >  together so that each contiguous set of hexagons is a single 
>> > > > > > > >  polygon.
>> > > > > > > >  I'm doing this last step using gUnaryUnion (or 
>> > > > > > > >  gUnionCascaded, not
>> > > > > > > >  clear what the different is actually). The problem is that 
>> > > > > > > >  in some
>> > > > > > > >  cases boundaries between clearly adjacent polygons are not 
>> > > > > > > >  dissolved.
>> > > > > > > > 
>> > > > > > > >  Example:
>> > > > > > > > 
>> > > > > > > >  ## Create three adjacent hexagons
>> > > > > > > >  library(sp)
>> > > > > > > >  library(rgeos)
>> > > > > > > >  p1 <- Polygon(cbind(
>> > > > > > > >       c(1276503.26781119, 1281876.11747031, 1287248.96712942,
>> > > > > > > >         1287248.96712942, 1281876.11747031, 1276503.26781119,
>> > > > > > > 
>> > > > > > >  1276503.26781119),
>> > > > > > > > 
>> > > > > > > >       c(204391.40834643, 207493.42454344, 204391.40834643,
>> > > > > > > 
>> > > > > > >  198187.37595242,
>> > > > > > > > 
>> > > > > > > >         195085.35975541, 198187.37595242, 204391.40834643)))
>> > > > > > > >  p2 <- Polygon(cbind(
>> > > > > > > >       c(1287248.96712943, 1292621.81678854, 1297994.66644766,
>> > > > > > > >         1297994.66644766, 1292621.81678854, 1287248.96712943,
>> > > > > > > 
>> > > > > > >  1287248.96712943),
>> > > > > > > > 
>> > > > > > > >       c(204391.40834643, 207493.42454344, 204391.40834643,
>> > > > > > > 
>> > > > > > >  198187.37595242,
>> > > > > > > > 
>> > > > > > > >         195085.35975541, 198187.37595242, 204391.40834643)))
>> > > > > > > >  p3 <- Polygon(cbind(
>> > > > > > > >       c(1281876.11747031, 1287248.96712943, 1292621.81678854,
>> > > > > > > >         1292621.81678854, 1287248.96712943, 1281876.11747031,
>> > > > > > > 
>> > > > > > >  1281876.11747031),
>> > > > > > > > 
>> > > > > > > >       c(213697.45693745, 216799.47313446, 213697.45693745,
>> > > > > > > 
>> > > > > > >  207493.42454344,
>> > > > > > > > 
>> > > > > > > >         204391.40834643, 207493.42454344, 213697.45693745)))
>> > > > > > > >  spoly <- SpatialPolygons(list(Polygons(list(p1, p2, p3), 
>> > > > > > > >  's1')))
>> > > > > > > >  plot(gUnaryUnion(spoly))
>> > > > > > > 
>> > > > > > > 
>> > > > > > >  No, this is just numerical fuzz:
>> > > > > > > 
>> > > > > > >  plot(spoly)
>> > > > > > >  plot(gUnaryUnion(spoly), border="green", lty=3, lwd=2, 
>> > > > > > >  add=TRUE)
>> > > > > > >  oS <- getScale()
>> > > > > > >  # default 1e+8
>> > > > > > >  setScale(1e+4)
>> > > > > > >  plot(gUnaryUnion(spoly), border="orange", lwd=2, add=TRUE)
>> > > > > > >  setScale(oS)
>> > > > > > > 
>> > > > > > >  JTS, GEOS, and consequently rgeos by default shift all 
>> > > > > > >  coordinates to
>> > > > > > >  an
>> > > > > > >  integer grid after multiplying by a scale factor (finding 
>> > > > > > >  integer
>> > > > > > >  matches
>> > > > > > >  is much easier than real matches). If the scaling is too 
>> > > > > > >  detailed (in
>> > > > > > >  some
>> > > > > > >  cases), the operations do not give the expected outcomes.
>> > > > > > > 
>> > > > > > >  There is work in progress in GEOS and JTS to provide other 
>> > > > > > >  scaling
>> > > > > > >  options
>> > > > > > >  and models, and to permit iteration over scaling values until 
>> > > > > > >  a
>> > > > > > >  "clean"
>> > > > > > >  result is obtained (for some meanings of clean).
>> > > > > > > 
>> > > > > > >  gUnionCascaded() was the only possible function for GEOS < 
>> > > > > > >  3.3.0, from
>> > > > > > >  GEOS 3.3.0 gUnaryUnion() is available and the prefered and 
>> > > > > > >  more
>> > > > > > >  efficient
>> > > > > > >  route. This is explained on the help page.
>> > > > > > > 
>> > > > > > >  Hope this clarifies,
>> > > > > > > 
>> > > > > > >  Roger
>> > > > > > > 
>> > > > > > > > 
>> > > > > > > >  Note that p2 and p3 are dissolved together, but p1 is 
>> > > > > > > >  separate. The
>> > > > > > > >  shared edge of p1 and p2 is:
>> > > > > > > >  p1:
>> > > > > > > >  [2,] 1281876 207493.4
>> > > > > > > >  [3,] 1287249 204391.4
>> > > > > > > >  p2:
>> > > > > > > >  [5,] 1287249 204391.4
>> > > > > > > >  [6,] 1281876 207493.4
>> > > > > > > > 
>> > > > > > > >  So, exactly the same apart from the order. I originally 
>> > > > > > > >  thought this
>> > > > > > > >  difference in order might be the problem, but this doesn't 
>> > > > > > > >  seem to be
>> > > > > > > >  an issue with in this example, where order is also flipped:
>> > > > > > > >  sss <- rasterToPolygons(raster(nrow=2, ncol=2, xmn=0, xmx=2, 
>> > > > > > > >  ymn=0,
>> > > > > > > >  ymx=2, vals=1:4))
>> > > > > > > >  lapply(sss at polygons, function(x) slot(x, 
>> > > > > > > >  'Polygons')[[1]]@coords)
>> > > > > > > >  plot(sss)
>> > > > > > > >  plot(gUnaryUnion(sss))
>> > > > > > > > 
>> > > > > > > >  Session Info:
>> > > > > > > >  R version 3.2.2 (2015-08-14)
>> > > > > > > >  Platform: x86_64-apple-darwin13.4.0 (64-bit)
>> > > > > > > >  Running under: OS X 10.10.5 (Yosemite)
>> > > > > > > > 
>> > > > > > > >  Message when rgeos is loaded:
>> > > > > > > > 
>> > > > > > > >  rgeos version: 0.3-14, (SVN revision 511)
>> > > > > > > >  GEOS runtime version: 3.5.0-CAPI-1.9.0 r0
>> > > > > > > >  Linking to sp version: 1.2-0
>> > > > > > > >  Polygon checking: TRUE
>> > > > > > > > 
>> > > > > > > >  Any help on how to get these polygons to dissolve is 
>> > > > > > > >  appreciated.
>> > > > > > > > 
>> > > > > > > >  M
>> > > > > > > > 
>> > > > > > > >  _______________________________________________
>> > > > > > > >  R-sig-Geo mailing list
>> > > > > > > >  R-sig-Geo at r-project.org
>> > > > > > > >  https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> > > > > > > > 
>> > > > > > > 
>> > > > > > >  --
>> > > > > > >  Roger Bivand
>> > > > > > >  Department of Economics, Norwegian School of Economics,
>> > > > > > >  Helleveien 30, N-5045 Bergen, Norway.
>> > > > > > >  voice: +47 55 95 93 55; fax +47 55 95 91 00
>> > > > > > >  e-mail: Roger.Bivand at nhh.no
>> > > > > > > 
>> > > > > > >  _______________________________________________
>> > > > > > >  R-sig-Geo mailing list
>> > > > > > >  R-sig-Geo at r-project.org
>> > > > > > >  https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> > > > > > > 
>> > > > > > 
>> > > > > 
>> > > > >  --
>> > > > >  Roger Bivand
>> > > > >  Department of Economics, Norwegian School of Economics,
>> > > > >  Helleveien 30, N-5045 Bergen, Norway.
>> > > > >  voice: +47 55 95 93 55; fax +47 55 95 91 00
>> > > > >  e-mail: Roger.Bivand at nhh.no
>> > > > > 
>> > > > 
>> > > 
>> > 
>> >  --
>> >  Roger Bivand
>> >  Department of Economics, Norwegian School of Economics,
>> >  Helleveien 30, N-5045 Bergen, Norway.
>> >  voice: +47 55 95 93 55; fax +47 55 95 91 00
>> >  e-mail: Roger.Bivand at nhh.no
>> > 
>> 
>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 91 00
e-mail: Roger.Bivand at nhh.no



More information about the R-sig-Geo mailing list