[R-sig-Geo] rgeos 'union' causes R crash

Roger Bivand Roger.Bivand at nhh.no
Thu Sep 8 23:56:31 CEST 2011


On Thu, 8 Sep 2011, Ben Weinstein wrote:

> Thanks for taking the time to respond. Quick followup. I looked at my task
> manager, and although I do see a decent spike in CPU usage, memory usage
> seems flat. Furthermore, a call to memory.size() shows that each loop is
> using around 50 MB of memory. I am using a pretty high-end machine which has
> 8 GB, and at least 3 GB are free during the run. The memory.limit() is the
> full 8 GB, and is not capped; mem.limits() returns NA. During the run the
> physical memory never goes past 30%.
>
> The script runs if you do 3 files or 10 files, but somewhere around 20 it
> becomes less dependable. I don't really follow: if it is able to do it once
> and write it to a matrix, why is it taking more memory to loop? I've added
> gc() after my union step to try to ensure that garbage collection is
> happening. The updated code is shown below.
>
> More interestingly, there seems to be something about the number of times
> this runs within an R session. So if the first time I run this I take in 20
> files, it works great. I can then go, remove everything, and rerun it for 19
> files, and it will crash!

Ben,

If you can make a zip archive with the shapefiles available (as a link), 
then it will be possible to check.

You may also note that gCentroid() is available in rgeos, giving a 
centroid for all member external rings, and that in fact each Polygons 
object has a centroid recorded (taking the value of the centroid of the 
largest member polygon, so differing in multi-ring cases only). They are 
accessed by coordinates() of a SpatialPolygons object. This is of course 
not a solution, but once we get that far, the conversion to a PolySet 
object may be redundant.
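
As a minimal sketch of the coordinates() point above, using a made-up
two-ring SpatialPolygons object rather than the shapefiles in question (the
object names and coordinates are hypothetical, and the gCentroid() call is
only attempted if rgeos happens to be installed):

```r
library(sp)

# Hypothetical species range: one large square and one small square,
# both rings stored in a single Polygons object (clockwise, not holes)
big   <- Polygon(cbind(c(0, 0, 4, 4, 0), c(0, 4, 4, 0, 0)), hole = FALSE)
small <- Polygon(cbind(c(10, 10, 11, 11, 10), c(0, 1, 1, 0, 0)), hole = FALSE)
sp_polys <- SpatialPolygons(list(Polygons(list(big, small), ID = "species_a")))

# Point recorded with each Polygons object: centroid of its largest ring
label_pts <- coordinates(sp_polys)
print(label_pts)

# Centroid over all member rings, only if rgeos is available
if (requireNamespace("rgeos", quietly = TRUE)) {
  all_rings <- coordinates(rgeos::gCentroid(sp_polys, byid = TRUE))
  print(all_rings)
}
```

In a multi-ring case like this one the two results should differ, since the
recorded point ignores the smaller ring.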

Roger

>
> #load required libraries
> library(rgdal)
> library(PBSmapping)
> library(maptools)
> library(raster)
> library(rgeos)
> #Enter File Directory Below
> setwd("D:/Ben/R/Trochilidae")
> #List all shapefiles in that folder that are polygons
> a<-list.files(pattern='pl.shp$', full.names=FALSE)
> #create an output matrix to store polygon centroids
> output<-matrix(nrow=length(a),ncol=3)
> #name the output matrix columns
> colnames(output)<-c("Sci_Name","X_centroid","Y_centroid")
> #create a for loop
> for(x in 1:length(a)){
> #remove the .shp from the file name, add it to the output matrix
> output[x,1]<-sub("\\.shp$","",a[x])
> #read in a single shapefile
> range<-readShapePoly(a[x], delete_null_obj=TRUE)
> #subset the data for Presence and Origin (species are NOT extinct, and
> #breeding ranges only)
> subsetx<-range[range$PRESENCE == 1,]
> rm(range)
> subsetx2<-subsetx[subsetx$ORIGIN ==1|subsetx$ORIGIN==2,]
> rm(subsetx)
> #dissolve the polygons to make one big file, this is the problematic step
> Union<-unionSpatialPolygons(subsetx2,IDs=subsetx2$ENGL_NAME)
> gc()
> #Change data types
> Breeding.ps<- SpatialPolygons2PolySet(Union)
> rm(Union)
> #find the centroids of the new polygons
> Breeding.centroids<- calcCentroid(Breeding.ps, rollup=1)
> rm(Breeding.ps)
> output[x,2:3]<-as.matrix(Breeding.centroids[2:3])
> }
>
> thanks
>
> I appreciate the help
>
> ben
>
>
> On Wed, Sep 7, 2011 at 4:34 PM, Colin Rundel <rundel at gmail.com> wrote:
>
>> Hi Ben,
>>
>> My best guess based on the information you have provided is that R is
>> running out of memory. Unions can use a huge amount of memory, particularly
>> for complex polygons. Can you check with the task manager what the memory
>> usage looks like when running your code?
>>
>> -Colin
>>
>> On Sep 7, 2011, at 8:26 AM, Ben Weinstein wrote:
>>
>>> Hi all-
>>>
>>> I'm having an odd issue with the gUnion family of commands from the rgeos
>>> package. I'm importing a series of polygons (species) and trying to find
>>> the centroid. Each species is only allowed one centroid, so I need to
>>> union the multiple polygons. This code works great when run line by line,
>>> but when run as an actual loop, by adding the {} at beginning and end, it
>>> gives me a runtime error. I have debugged it to the level that I know it
>>> is the union polygon step. I am running R 2.13, and it was designed for
>>> 2.13.1; is that really making the difference? When run, R completely
>>> freezes and Windows shuts it down.
>>>
>>> library(rgdal)
>>> library(PBSmapping)
>>> library(maptools)
>>> library(raster)
>>> library(rgeos)
>>> #Enter File Directory Below
>>> setwd("D:/Ben/R/Trochilidae")
>>> #Pull in all shapefiles in that folder that are polygons
>>> a<-list.files(pattern='pl.shp$', full.names=FALSE)
>>> output<-matrix(nrow=length(a),ncol=3)
>>> colnames(output)<-c("Sci_Name","X_centroid","Y_centroid")
>>> for(x in 1:length(a)){
>>> output[(x),1]<-paste(gsub(".shp","",a[x]))
>>> range<-readShapePoly(a[x], delete_null_obj=TRUE)
>>> subsetx<-range[range$PRESENCE == 1,]
>>> subsetx2<-subsetx[subsetx$ORIGIN ==1|subsetx$ORIGIN==2,]
>>> z<-matrix(length(subsetx2),1, data=1)
>>> Union<-unionSpatialPolygons(subsetx2,z)
>>> Breeding.ps<- SpatialPolygons2PolySet(Union)
>>> Breeding.centroids<- calcCentroid(Breeding.ps, rollup=1)
>>> output[x,2:3]<-as.matrix(Breeding.centroids[2:3])}
>>>
>>> thanks
>>>
>>> ben weinstein
>>>
>>>
>>>
>>>
>>> --
>>> Ben Weinstein
>>> Graduate Student
>>> Ecology and Evolution
>>> Stony Brook University
>>>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


