[R-sig-Geo] Memory management with rasterToPolygons (raster and rgeos)
pgalpern
pgalpern at gmail.com
Thu Jan 5 19:49:57 CET 2012
Agreed these are very large objects. I'll look at tiling as a general
solution to this problem.
For others facing the same challenge it is worth noting that I have been
successful in running rasterToPolygons(x, dissolve=TRUE) on rasters up
to 800000 cells producing an object containing approx 1500
SpatialPolygons under 64bit Windows, by ensuring there is at least 7GB
of overhead memory. Run time was reasonable. R instance must be
terminated following function call to free up the memory.
On 05/01/2012 12:19 PM, Roger Bivand wrote:
> On Thu, 5 Jan 2012, pgalpern wrote:
>
>> I did some further research into my own question when I twigged to
>> the idea that this might be a memory leak with the GEOS library.
>>
>> It seems likely that is, and has been documented in this forum this
>> October past:
>> https://mailman.stat.ethz.ch/pipermail/r-sig-geo/2011-October/013289.html
>>
>> As of October there didn't appear to be any real resolution to the
>> problem, except - perhaps - to run rgeos under Linux.
>>
>> Is this the status quo?
>
> The issue with rgeos/GEOS is unresolved, and has led to at least two
> releases in the mean time. Using Linux does not help. It may be
> possible to run with dissolve=FALSE, and step through chunks of
> feature classes, in separate R scripts. However, it isn't just an
> rgeos issue, as:
>
> library(raster)
> r <- raster(nrow=500, ncol=1000)
> r[] <- rpois(ncell(r), lambda=70)
> pol <- rasterToPolygons(r, dissolve=FALSE)
>
> gives me:
>
>> object.size(r)
> 4011736 bytes
>> object.size(pol)
> 1458003216 bytes
>
> so pol is 1.5G here, with 73 categories (I forgot to set.seed()). Only
> raster and sp are present here. The arithmetic is:
>
>> object.size(slot(pol, "polygons")[[1]])
> 2896 bytes
>> dim(pol)
> [1] 500000 1
>> object.size(slot(pol, "polygons")[[1]])*dim(pol)[1]
> 1.448e+09 bytes
>
> so the input SpatialPolygons object is already large, and GEOS needs a
> separate copy in its format, plus working copies.
>
> Could you work on tiles of the raster, then join those?
>
> We're still hoping that someone will help with bug-fixing in rgeos,
> but this is also a data representation question, I think.
>
> Hope this helps,
>
> Roger
>
>>
>> Thanks,
>> Paul
>>
>> On 04/01/2012 8:57 PM, pgalpern wrote:
>>> Hello!
>>>
>>> Not sure if this is the best place to ask what may ultimately be an
>>> rgeos question.
>>>
>>> I am running the latest versions of the raster and rgeos packages
>>> under 64bit 2.14.1 on Windows 2008R2 Server with 12GB RAM and having
>>> some challenges with memory.
>>>
>>> I am turning rasters (approx 500 x 1000 cells) into 1500
>>> SpatialPolygons, representing each of the feature classes. It works
>>> as it should, using rasterToPolygons(x, dissolve=T) but memory
>>> overhead is sky high.
>>>
>>> For example, a single instance of this function quickly consumes 2-3
>>> GB and would probably consume more if other instances were not also
>>> running simultaneously. As a result disk swapping occurs which
>>> slows everything down. Interestingly, the input raster and output
>>> SpatialPolygons objects are only megabytes in size. Running this
>>> under 32bit R doesn't seem to help and occasionally results in
>>> memory allocation failures.
>>>
>>> Finally, deleting the raster and polygons objects when the function
>>> is complete and running gc() does not seem to release the memory.
>>> Instead the entire R instance needs to be terminated.
>>>
>>> Can anyone tell me if this is expected behaviour ... or perhaps
>>> suggest a more memory efficient approach.
>>>
>>> Thank you,
>>> Paul
>>>
>>
>>
>
--
Paul Galpern, PhD Candidate
Natural Resources Institute
70 Dysart Road
University of Manitoba
Winnipeg, Manitoba, Canada R3T 2M6
http://borealscape.ca
More information about the R-sig-Geo
mailing list