[R-sig-Geo] Memory management with rasterToPolygons (raster and rgeos)

pgalpern pgalpern at gmail.com
Thu Jan 5 19:49:57 CET 2012


Agreed these are very large objects.  I'll look at tiling as a general 
solution to this problem.
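
A minimal sketch of what tiling might look like, assuming only raster and sp are loaded. The helper name polygonize_by_tile and its arguments are hypothetical, not part of either package. It cuts the raster into blocks by row and column index, converts each block on its own, and binds the pieces:

```r
library(raster)  # also attaches sp; dissolve=TRUE would additionally need rgeos

# Hypothetical helper: split a raster into n_row x n_col blocks of whole cells,
# polygonize each block separately, then bind the pieces into one object.
# Assumes n_row <= nrow(r), n_col <= ncol(r), and no block that is entirely NA.
polygonize_by_tile <- function(r, n_row = 2, n_col = 2, dissolve = FALSE) {
  row_breaks <- round(seq(0, nrow(r), length.out = n_row + 1))
  col_breaks <- round(seq(0, ncol(r), length.out = n_col + 1))
  pieces <- list()
  for (j in seq_len(n_row)) {
    for (i in seq_len(n_col)) {
      # extent(r, r1, r2, c1, c2) gives the extent of a block of rows/columns
      e <- extent(r, row_breaks[j] + 1, row_breaks[j + 1],
                  col_breaks[i] + 1, col_breaks[i + 1])
      tile <- crop(r, e)
      pieces[[length(pieces) + 1]] <- rasterToPolygons(tile, dissolve = dissolve)
    }
  }
  # makeUniqueIDs avoids clashes between polygon IDs from different tiles
  do.call(rbind, c(pieces, makeUniqueIDs = TRUE))
}

# Small example: a 10 x 20 raster converted in four tiles
r_small <- raster(nrows = 10, ncols = 20)
r_small[] <- seq_len(ncell(r_small))
pol_small <- polygonize_by_tile(r_small, n_row = 2, n_col = 2)
```

With dissolve=TRUE, a feature class that spans several tiles would come back as several polygons, so a final merge across tile edges would presumably still be needed.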

For others facing the same challenge, it is worth noting that I have 
successfully run rasterToPolygons(x, dissolve=TRUE) on rasters of up to 
800000 cells, producing an object containing approx. 1500 
SpatialPolygons, under 64-bit Windows by ensuring there is at least 7GB 
of overhead memory.  Run time was reasonable.  The R instance must be 
terminated after the function call to free up the memory.
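
One way to automate the restart, sketched here with base R only: run the conversion in a child Rscript process and read the result back from a file, so that whatever the child allocated is returned to the operating system when it exits. The helper name run_in_child is hypothetical, and the expression in the example is a trivial stand-in for the real rasterToPolygons() call.

```r
# Hypothetical helper: evaluate R code (given as text) in a fresh Rscript
# process and return the value of its last expression. The child saves the
# result to a temporary .rds file, which the parent reads after the child exits.
run_in_child <- function(expr_text) {
  out_file <- tempfile(fileext = ".rds")
  script <- tempfile(fileext = ".R")
  on.exit(unlink(c(out_file, script)))
  # deparse() quotes and escapes the path so it is a valid R string literal
  code <- sprintf("saveRDS({\n%s\n}, file = %s)", expr_text, deparse(out_file))
  writeLines(code, script)
  rscript <- file.path(R.home("bin"), "Rscript")
  status <- system2(rscript, shQuote(script))
  if (status != 0) stop("child R process failed with status ", status)
  readRDS(out_file)
}

# Stand-in expression; in practice the text would load raster, read a raster
# from disk, and call rasterToPolygons(x, dissolve=TRUE) on it.
res <- run_in_child("sum(1:10)")
```

The result still has to fit in the parent session when it is read back, so this only avoids the memory that is not released after the call, not the size of the output itself.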


On 05/01/2012 12:19 PM, Roger Bivand wrote:
> On Thu, 5 Jan 2012, pgalpern wrote:
>
>> I did some further research into my own question when I twigged to 
>> the idea that this might be a memory leak with the GEOS library.
>>
>> It seems likely that it is, and this was documented in this forum this 
>> past October: 
>> https://mailman.stat.ethz.ch/pipermail/r-sig-geo/2011-October/013289.html
>>
>> As of October there didn't appear to be any real resolution to the 
>> problem, except - perhaps - to run rgeos under Linux.
>>
>> Is this the status quo?
>
> The issue with rgeos/GEOS is unresolved, and has led to at least two 
> releases in the meantime. Using Linux does not help. It may be 
> possible to run with dissolve=FALSE, and step through chunks of 
> feature classes, in separate R scripts. However, it isn't just an 
> rgeos issue, as:
>
> library(raster)
> r <- raster(nrow=500, ncol=1000)
> r[] <- rpois(ncell(r), lambda=70)
> pol <- rasterToPolygons(r, dissolve=FALSE)
>
> gives me:
>
>> object.size(r)
> 4011736 bytes
>> object.size(pol)
> 1458003216 bytes
>
> so pol is 1.5G here, with 73 categories (I forgot to set.seed()). Only 
> raster and sp are present here. The arithmetic is:
>
>> object.size(slot(pol, "polygons")[[1]])
> 2896 bytes
>> dim(pol)
> [1] 500000      1
>> object.size(slot(pol, "polygons")[[1]])*dim(pol)[1]
> 1.448e+09 bytes
>
> so the input SpatialPolygons object is already large, and GEOS needs a 
> separate copy in its format, plus working copies.
>
> Could you work on tiles of the raster, then join those?
>
> We're still hoping that someone will help with bug-fixing in rgeos, 
> but this is also a data representation question, I think.
>
> Hope this helps,
>
> Roger
>
>>
>> Thanks,
>> Paul
>>
>> On 04/01/2012 8:57 PM, pgalpern wrote:
>>> Hello!
>>>
>>> Not sure if this is the best place to ask what may ultimately be an 
>>> rgeos question.
>>>
>>> I am running the latest versions of the raster and rgeos packages 
>>> under 64-bit R 2.14.1 on Windows 2008R2 Server with 12GB RAM and having 
>>> some challenges with memory.
>>>
>>> I am turning rasters (approx 500 x 1000 cells) into 1500 
>>> SpatialPolygons, representing each of the feature classes. It works 
>>> as it should, using rasterToPolygons(x, dissolve=TRUE), but memory 
>>> overhead is sky high.
>>>
>>> For example, a single instance of this function quickly consumes 2-3 
>>> GB and would probably consume more if other instances were not also 
>>> running simultaneously.   As a result disk swapping occurs which 
>>> slows everything down.  Interestingly, the input raster and output 
>>> SpatialPolygons objects are only megabytes in size.  Running this 
>>> under 32bit R doesn't seem to help and occasionally results in 
>>> memory allocation failures.
>>>
>>> Finally, deleting the raster and polygons objects when the function 
>>> is complete and running gc() does not seem to release the memory.  
>>> Instead the entire R instance needs to be terminated.
>>>
>>> Can anyone tell me if this is expected behaviour, or perhaps 
>>> suggest a more memory-efficient approach?
>>>
>>> Thank you,
>>> Paul
>>>
>>
>>
>

-- 
Paul Galpern, PhD Candidate
Natural Resources Institute
70 Dysart Road
University of Manitoba
Winnipeg, Manitoba, Canada R3T 2M6
http://borealscape.ca
