[R-sig-Geo] gUnion causes segfault

Roger Bivand Roger.Bivand at nhh.no
Thu Jun 2 23:27:12 CEST 2011


On Thu, 2 Jun 2011, Brian J. Stults wrote:

>> On Thu, 2 Jun 2011, Brian J. Stults wrote:
>>
>>> Hello,
>>>
>>> I am working with the 2009 TIGER/Line topological faces files.  I want
>>> to create a shapefile with polygons for unique instances of state,
>>> county, place, and tract.  Since the topological faces shapefiles
>>> provide many smaller geographies, my approach has been to dissolve those
>>> smaller geographies into larger ones using the gUnion function from
>>> rgeos.  This works for most counties, but it causes segfaults for some.
>>> One example is Apache County, AZ.  The shapefile is here:
>> Please always include the output of sessionInfo() in any report like
>> this. Both the OS for binary packages, and the specific version of
>> rgeos, may play a role.
>
> Thanks for looking into this.  Here is the sessionInfo output.
>
> R version 2.13.0 (2011-04-13)
> Platform: i486-pc-linux-gnu (32-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] rgeos_0.1-6     stringr_0.4     maptools_0.8-7  lattice_0.19-26
> [5] sp_0.9-82       foreign_0.8-44
>
> loaded via a namespace (and not attached):
> [1] grid_2.13.0 plyr_1.5.2
> Warning message:
> 'DESCRIPTION' file has 'Encoding' field and re-encoding is not possible


OK, thanks. Next, how were GEOS and the GEOS C API installed - from source, 
or from a Debian/Ubuntu or RPM package? If from a package, which one?
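
For reference, a minimal sketch of collecting the session details and the 
GEOS version from within R. It assumes the installed rgeos release exports 
version_GEOS(); it does not show how GEOS itself was installed, which only 
the system package manager can report.

```r
# Report the R session and, if rgeos is installed, the GEOS it is linked to.
info <- sessionInfo()
print(info)

geos_version <- if (requireNamespace("rgeos", quietly = TRUE)) {
  rgeos::version_GEOS()   # GEOS runtime version string reported by rgeos
} else {
  NA_character_           # rgeos is not available in this session
}
print(geos_version)
```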

>
>
>> I cannot check on a 1GB laptop, because the
>> shapefile creates a 180MB object and has 63K polygons.
>
> Here is a smaller county that results in the same problem:
>
> http://www2.census.gov/geo/tiger/TIGER2009/02_ALASKA/02013_Aleutians_East_Borough/tl_2009_02013_faces.zip
>
>
>> I don't know why
>> subsetting the columns in the data slot would help, but I do think that
>> your assignment back into the object is a hindrance in memory terms for
>> such a large object - provoking copies.
>
> I am (probably obviously) pretty new to R.  I thought that subsetting
> the columns would reduce the required memory.  Perhaps it was just the
> opposite.
>
>
>> I assumed that you do know that
>> you have no other way to dissolve so many polygons into so few (40)
>> output units - this isn't a typical use case. The version of rgeos may
>> matter, as protection against unclean objects provoking seg.faults has
>> recently been extended.
>
> I first tried using the intersection of the TIGER files for places and
> tracts using gIntersection since there are far fewer polygons to deal
> with in those files.  However, the processing took a prohibitively long
> time.  Using gUnionCascaded didn't take too, too long for the counties
> that did not cause a segfault.

I'll see whether building candidate intersections with the STRtree helps - 
it may not.
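
The idea of building candidate intersections can be sketched in base R. 
This is only an illustration: a brute-force bounding-box test stands in for 
the STRtree, the bounding boxes are made-up values, and the names places, 
tracts and bbox_overlap are hypothetical. Only the candidate pairs found 
this way would then need the exact (and expensive) intersection.

```r
# Each row is one polygon's bounding box: xmin, ymin, xmax, ymax (made-up values).
places <- rbind(c(0, 0, 2, 2), c(5, 5, 6, 6))
tracts <- rbind(c(1, 1, 3, 3), c(7, 7, 8, 8), c(5.5, 4, 9, 5.5))

# TRUE when two boxes overlap or touch in both x and y.
bbox_overlap <- function(a, b) {
  a[1] <= b[3] && b[1] <= a[3] && a[2] <= b[4] && b[2] <= a[4]
}

# For each place, the indices of the tracts whose boxes overlap it.
candidates <- lapply(seq_len(nrow(places)), function(i) {
  which(apply(tracts, 1, function(b) bbox_overlap(places[i, ], b)))
})
print(candidates)
```

A spatial index such as an STRtree answers the same query without comparing 
every pair of boxes, which is where any speed-up would come from.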

Roger

>
> Thanks,
> Brian
>
>
>>
>> Roger
>>
>>
>>>
>>> http://www2.census.gov/geo/tiger/TIGER2009/04_ARIZONA/04001_Apache_County/tl_2009_04001_faces.zip
>>>
>>>
>>> My code (modified to work on a single county) is:
>>>
>>> library(maptools)
>>> library(rgeos)
>>>
>>> tiger <- readShapePoly("tl_2009_04001_faces.shp",
>>> proj4string=CRS("+proj=longlat +ellps=GRS80 +datum=NAD83 +no_defs"))[,
>>> c(4,37)]
>>> tiger$PLACEFP <- as.character(tiger$PLACEFP)
>>> tiger$PLACEFP[is.na(tiger$PLACEFP)] <- "99999"
>>> tiger$uniqueid <- paste(tiger$PLACEFP00, tiger$TRACTCE00, sep="")
>>> tiger.dissolve <- gUnionCascaded(tiger, tiger$uniqueid)
>>> q()
>>>
>>>
>>> The error message is:
>>>
>>>> tiger.dissolve <- gUnionCascaded(tiger, tiger$uniqueid)
>>>
>>> *** caught segfault ***
>>> address 0x17c9, cause 'memory not mapped'
>>>
>>> Traceback:
>>> 1: .Call(func, .RGEOS_HANDLE, spgeom, id, byid, PACKAGE = "rgeos")
>>> 2: TopologyFunc(groupID(spgeom, id), unique(na.omit(id)), TRUE,
>>> "rgeos_unioncascaded")
>>> 3: gUnionCascaded(tiger, tiger$uniqueid)
>>> aborting ...
>>>
>>>
>>> The full output can be viewed here:
>>> http://www2.criminology.fsu.edu/~stults/misc/union_tiger.txt
>>>
>>>
>>> Can anyone tell me what is going wrong, or how to go about debugging the
>>> problem?  I uninstalled rgeos to force using UnionSpatialPolygons (there
>>> must be a better way to force that than uninstalling, right?), but that
>>> ran overnight and never finished.  I am guessing there must be something
>>> strange about these shapefiles.
>>>
>>> I can successfully dissolve this shapefile using the ftools module in
>>> qgis.  However, I want to do this for a large number of counties, which
>>> is why I am pursuing a programmed solution.  I tried doing it via python
>>> scripting with qgis, but could not get that to work after a lot of
>>> trying.  I am currently trying to do it with Spatialite, which seems
>>> promising.  I would be happy to hear any other suggested approaches.
>>>
>>> Thanks,
>>> Brian
>>>
>>>
>>
>
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


