[R-sig-Geo] gUnion: no outgoing dirEdge found

Sat Jun 18 14:56:41 CEST 2011

On Fri, 17 Jun 2011, Brian J. Stults wrote:

> On 06/17/2011 04:03 PM, Roger Bivand wrote:
>> On Fri, 17 Jun 2011, Brian J. Stults wrote:
>>
>> no error exit - is this a script running on a node, if so, could you
>> catch errors with try() or similar means? I am in doubt about whether
>
> Yes, it is a script.  Wrapping the command in try() or tryCatch()
> resulted in the same output, as did running the commands interactively.
>
>> your approach is sensible, and suspect that running gEquals(de1990,
>> de2000, byid=TRUE) first to establish the duplicates, then tackle the
>> ones which are not seen as equal. Most of the polygons are the same:
>
> Delaware 1990-2000 is probably not a good example since tracts did not
> change much.  Indeed, that's why I chose it to test with, so it might
> run more quickly and be less confusing to me.  When I union 1970 and
> 2010, there will be many more differences (though fewer tracted areas in
> 1970).
>

For your data, this works:

library(maptools)
de1990 <- readShapeSpatial("subset_1990_de")
de2000 <- readShapeSpatial("subset_2000_de")
library(rgeos)
slot(de1990, "polygons") <- lapply(slot(de1990, "polygons"),
   checkPolygonsHoles)
slot(de2000, "polygons") <- lapply(slot(de2000, "polygons"),
   checkPolygonsHoles)
row.names(de1990) <- paste("de1990", row.names(de1990), sep="_")
row.names(de2000) <- paste("de2000", row.names(de2000), sep="_")
eq <- gEquals(de1990, de2000, byid=TRUE)
nde1990 <- de1990[!apply(eq, 2, any),]
nde2000 <- de2000[!apply(eq, 1, any),]
eq1 <- eq[apply(eq, 1, any), apply(eq, 2, any)]
wh <- which(eq1, arr.ind=TRUE)
same <- de1990[apply(eq, 2, any),]
row.names(same) <- paste(row.names(same), row.names(de2000[apply(eq, 1,
   any),])[wh[,1]])
int <- gIntersects(nde1990, nde2000, byid=TRUE)
vec <- vector(mode="list", length=dim(int)[2])
for (i in seq(along=vec)) vec[[i]] <- try(gUnion(nde1990[i,],
   nde2000[int[,i],], byid=TRUE))
# needs checking for non-SpatialPolygons entities
out <- do.call("rbind", vec)

plot(out, col="cyan")
plot(same, col="magenta", add=TRUE)

out1 <- spRbind(out, same)

plot(de1990, col="cyan", border="magenta")
plot(out1, add=TRUE)

plot(de2000, col="cyan", border="magenta")
plot(out1, add=TRUE)

row.names(out1)

Apart from working, I think that it suggests that union is not the 
operation needed (the plot of merged borders on 1990 borders shows that 
some 1990 borders are not respected in the merged set), but perhaps rather 
a sequence of intersection, and two asymmetric differences, before 
combining on completion, so that all the parts get retained whether split 
or joined between periods.

For your larger project, you will also run into cleaning issues, and 
possibly also datum shift and precision model questions. The underlying 
data are presumably Tiger, which typically do need manual intervention, I 
believe. Have you tried to do this in a topological GIS, either ArcINFO or 
GRASS?

For me on GRASS 6.4.1, a sequence of two v.in.ogr commands in a suitable 
location and region, followed by v.overlay, then v.out.ogr, generates what 
is needed. Building GRASS for your cluster and scripting it in shell 
script should be easier than QGIS:

v.in.ogr -o dsn=. layer=subset_1990_de output=de1990
v.in.ogr -o dsn=. layer=subset_2000_de output=de2000
v.overlay ainput=de1990 binput=de2000 output=overlay
v.out.ogr -c input=overlay type=area dsn=. olayer=overlay

in region:

> g.region -p
projection: 3 (Latitude-Longitude)
zone:       0
datum:      nad83
ellipsoid:  grs80
north:      39N
south:      38:48N
west:       77:07:12W
east:       76:54W

Or from within R:

library(rgdal)
de2000 <- readOGR(".", "subset_2000_de")
de1990 <- readOGR(".", "subset_1990_de")
proj4string(de2000) <- CRS("+proj=longlat +datum=NAD83")
proj4string(de1990) <- CRS("+proj=longlat +datum=NAD83")
library(maptools)
SG <- Sobj_SpatialGrid(de1990)$SG
library(spgrass6)
# using a throw-away temporary location and
# my GRASS installation path
initGRASS("/home/rsb/topics/grass/g641/grass-6.4.1", tempdir(), SG)
# need to ensure that location is in correct
# coordinate reference system
execGRASS("g.proj", flags="p")
tf <- tempfile()
mapset <- execGRASS("g.gisenv", parameters=list(get="MAPSET"), 
intern=TRUE)
execGRASS("g.gisenv", parameters=list(set=shQuote('MAPSET=PERMANENT')))
prj <- showWKT(proj4string(de1990), tf)
execGRASS("g.proj", flags="c", parameters=list(wkt=tf))
execGRASS("g.proj", flags="p")
execGRASS("g.gisenv", parameters=list(set=paste("'MAPSET=", mapset,
  "'", sep="")))
execGRASS("g.region", flags="d")
# now in correct CRS
writeVECT6(de1990, "de1990")
writeVECT6(de2000, "de2000")
execGRASS("v.overlay", parameters=list(ainput="de1990", binput="de2000",
   output="overlay"))
o <- readVECT6("overlay", with_c=TRUE)
plot(de1990, col="cyan", border="magenta")
plot(o, add=TRUE)
plot(de2000, col="cyan", border="magenta")
plot(o, add=TRUE)

GRASS is doing what you need it to do. You don't need QGIS to use GRASS, 
but GRASS - because it is constructing the topology for each vector 
object, can avoid the awkwardness of using the OGC SFS constructions used 
in GEOS. So my advice would be to deploy GRASS, but you'll need to be 
careful to construct the locations, regions, and CRS suitably. The GRASS 
book by Neteler and Mitasova (2008, 3rd ed.), has a discussion of 
v.overlay on pp. 206-208.

Hope this helps,

Roger

>>
>> eq <- gEquals(de1990, de2000, byid=TRUE)
>> sum(apply(eq, 2, any))
>> sum(apply(eq, 1, any))
>> plot(de1990, col="cyan", border="transparent")
>> plot(de1990[apply(eq, 2, any),], add=TRUE)
>> nde1990 <- de1990[!apply(eq, 2, any),]
>> nde2000 <- de2000[!apply(eq, 1, any),]
>>
>> then step through int row by row (or column by column) to union the ones
>> for which the correct predicate is TRUE for that combination. I'm unsure
>> whether you want to dissolve to the 2000 file or the other way round, or
>> to create sub-units permiting either to be reconstructed, so I can't
>> suggest further steps now. The predicates should let you find the
>
> I want to retain all sub-units, so I think the last option you listed is
> what I want.
>
>> geometries on which topology operations are to be performed, which
>> should then work provided that the shapefiles are clean. The trick seems
>> to be in finding the right predicates, then the right operation, but it
>> needs interactive analysis to work through use cases and instrument a
>> script to be sure that the correct joins are being made.
>
> I very much look forward to being capable of doing what you suggest, but
> right now I think it is beyond my coding ability in R.  I hate to give
> up so quickly, but perhaps the better option for now is compiling Qgis
> on the supercomputer and running the union that way.  But I *will*
> eventually learn how to do it in R.
>
> Thanks!
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no