[R-sig-Geo] gUnion causes segfault

Brian J. Stults bstults at fsu.edu
Thu Jun 2 20:10:31 CEST 2011


> On Thu, 2 Jun 2011, Brian J. Stults wrote:
> 
>> Hello,
>>
>> I am working with the 2009 TIGER/Line topological faces files.  I want
>> to create a shapefile with polygons for unique instances of state,
>> county, place, and tract.  Since the topological faces shapefiles
>> provide many smaller geographies, my approach has been to dissolve those
>> smaller geographies into larger ones using the gUnion function from
>> rgeos.  This works for most counties, but it causes segfaults for some.
>> One example is Apache County, AZ.  The shapefile is here:
> Please always include the output of sessionInfo() in any report like
> this. Both the OS for binary packages, and the specific version of
> rgeos, may play a role.

Thanks for looking into this.  Here is the sessionInfo output.

R version 2.13.0 (2011-04-13)
Platform: i486-pc-linux-gnu (32-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] rgeos_0.1-6     stringr_0.4     maptools_0.8-7  lattice_0.19-26
[5] sp_0.9-82       foreign_0.8-44

loaded via a namespace (and not attached):
[1] grid_2.13.0 plyr_1.5.2
Warning message:
'DESCRIPTION' file has 'Encoding' field and re-encoding is not possible


> I cannot check on a 1GB laptop, because the
> shapefile creates a 180MB object and has 63K polygons. 

Here is a smaller county that results in the same problem:

http://www2.census.gov/geo/tiger/TIGER2009/02_ALASKA/02013_Aleutians_East_Borough/tl_2009_02013_faces.zip


> I don't know why
> subsetting the columns in the data slot would help, but I do think that
> your assignment back into the object is a hindrance in memory terms for
> such a large object - provoking copies. 

I am (probably obviously) pretty new to R.  I thought that subsetting
the columns would reduce the required memory.  Perhaps it was just the
opposite.
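
To make the memory question concrete, here is a rough base-R sketch.  It
does not use the TIGER data; `attrs` and `geom` are made-up stand-ins for
the attribute table and the polygon coordinates, and the sizes are
arbitrary.  Under those assumptions it suggests that dropping attribute
columns does shrink the table, but the coordinates are untouched and may
well account for most of the object.

```r
# Hypothetical stand-ins, not the real data: a 40-column attribute table
# and a list of coordinate matrices, one per polygon.
n <- 1000
attrs <- as.data.frame(matrix(runif(n * 40), ncol = 40))
geom <- replicate(n, matrix(runif(200), ncol = 2), simplify = FALSE)

# Size of the attribute table before and after keeping two columns.
size_attrs_full <- as.numeric(object.size(attrs))
size_attrs_sub <- as.numeric(object.size(attrs[, c(4, 37)]))

# Size of the coordinates, which column subsetting does not change.
size_geom <- as.numeric(object.size(geom))

print(c(attrs_full = size_attrs_full,
        attrs_sub = size_attrs_sub,
        geom = size_geom))
```

With these made-up sizes the coordinates outweigh the full attribute
table, so the saving from dropping columns is small relative to the
whole object.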


> I assumed that you do know that
> you have no other way to dissolve so many polygons into so few (40)
> output units - this isn't a typical use case. The version of rgeos may
> matter, as protection against unclean objects provoking seg.faults has
> recently been extended.

I first tried using the intersection of the TIGER files for places and
tracts using gIntersection since there are far fewer polygons to deal
with in those files.  However, the processing took a prohibitively long
time.  Using gUnionCascaded didn't take too long for the counties
that did not cause a segfault.

Thanks,
Brian


> 
> Roger
> 
> 
>>
>> http://www2.census.gov/geo/tiger/TIGER2009/04_ARIZONA/04001_Apache_County/tl_2009_04001_faces.zip
>>
>>
>> My code (modified to work on a single county) is:
>>
>> library(maptools)
>> library(rgeos)
>>
>> tiger <- readShapePoly("tl_2009_04001_faces.shp",
>> proj4string=CRS("+proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs"))[,
>> c(4,37)]
>> tiger$PLACEFP <- as.character(tiger$PLACEFP)
>> tiger$PLACEFP[is.na(tiger$PLACEFP)] <- "99999"
>> tiger$uniqueid <- paste(tiger$PLACEFP00, tiger$TRACTCE00, sep="")
>> tiger.dissolve <- gUnionCascaded(tiger, tiger$uniqueid)
>> q()
>>
>>
>> The error message is:
>>
>>> tiger.dissolve <- gUnionCascaded(tiger, tiger$uniqueid)
>>
>> *** caught segfault ***
>> address 0x17c9, cause 'memory not mapped'
>>
>> Traceback:
>> 1: .Call(func, .RGEOS_HANDLE, spgeom, id, byid, PACKAGE = "rgeos")
>> 2: TopologyFunc(groupID(spgeom, id), unique(na.omit(id)), TRUE,
>> "rgeos_unioncascaded")
>> 3: gUnionCascaded(tiger, tiger$uniqueid)
>> aborting ...
>>
>>
>> The full output can be viewed here:
>> http://www2.criminology.fsu.edu/~stults/misc/union_tiger.txt
>>
>>
>> Can anyone tell me what is going wrong, or how to go about debugging the
>> problem?  I uninstalled rgeos to force using UnionSpatialPolygons (there
>> must be a better way to force that than uninstalling, right?), but that
>> ran overnight and never finished.  I am guessing there must be something
>> strange about these shapefiles.
>>
>> I can successfully dissolve this shapefile using the ftools module in
>> qgis.  However, I want to do this for a large number of counties, which
>> is why I am pursuing a programmed solution.  I tried doing it via python
>> scripting with qgis, but could not get that to work after a lot of
>> trying.  I am currently trying to do it with Spatialite, which seems
>> promising.  I would be happy to hear any other suggested approaches.
>>
>> Thanks,
>> Brian
>>
>>
>
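
On the question of how to debug the crash: a segfault typically ends the
R session, so tryCatch() cannot be relied on to trap it.  One possible
approach, sketched below in base R, is to dissolve one output unit at a
time and write each ID to a log file before processing it, so that the
last ID in the log points at the group that failed.  The attribute table
`attrs`, the helper `dissolve_one`, and the log file are all assumptions
made for illustration; in real use the helper would wrap the per-group
union (for example a gUnionCascaded call on the subset of rows).

```r
# Stand-in attribute table; the real one would come from the shapefile.
attrs <- data.frame(
  PLACEFP   = c("02430", NA, "02430", NA),
  TRACTCE00 = c("970100", "970100", "970200", "970200"),
  stringsAsFactors = FALSE
)

# Key construction as in the quoted code: NA place codes become "99999".
place <- attrs$PLACEFP
place[is.na(place)] <- "99999"
uniqueid <- paste(place, attrs$TRACTCE00, sep = "")

# Row indices of the faces belonging to each output unit.
groups <- split(seq_along(uniqueid), uniqueid)

# Hypothetical placeholder for the real per-group union, e.g.
# function(rows) gUnionCascaded(tiger[rows, ]); here it only counts rows.
dissolve_one <- function(rows) length(rows)

logfile <- tempfile(fileext = ".log")
results <- list()
for (id in names(groups)) {
  # Append the ID before the risky call; cat() opens and closes the file
  # on each call, so the entry is on disk before the union starts.
  cat(id, "\n", sep = "", file = logfile, append = TRUE)
  results[[id]] <- dissolve_one(groups[[id]])
}

# After a crash, the last line of the log names the offending group.
last_logged <- tail(readLines(logfile), 1)
```

The failing group can then be extracted on its own, which gives a much
smaller object to inspect or to attach to a bug report.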


