[R-sig-Geo] Memory management with rasterToPolygons (raster and rgeos)
Roger.Bivand at nhh.no
Thu Jan 5 19:19:13 CET 2012
On Thu, 5 Jan 2012, pgalpern wrote:
> I did some further research into my own question when I twigged to the
> idea that this might be a memory leak with the GEOS library.
> It seems likely that it is, and this has been documented in this forum this
> October past:
> As of October there didn't appear to be any real resolution to the problem,
> except - perhaps - to run rgeos under Linux.
> Is this the status quo?
The issue with rgeos/GEOS is unresolved, and has led to at least two
releases in the meantime. Using Linux does not help. It may be possible
to run with dissolve=FALSE, and step through chunks of feature classes, in
separate R scripts. However, it isn't just an rgeos issue, as:
library(raster)
r <- raster(nrow=500, ncol=1000)
# fill the cells in place; plain `r <- rpois(...)` would replace the raster
r[] <- rpois(ncell(r), lambda=70)
pol <- rasterToPolygons(r, dissolve=FALSE)
so pol is 1.5G here, with 73 categories (I forgot to set.seed()). Only
raster and sp are present here. The arithmetic is:
> object.size(slot(pol, "polygons")[[1]])
> dim(pol)
[1] 500000      1
> object.size(slot(pol, "polygons")[[1]])*dim(pol)[1]
so the input SpatialPolygons object is already large, and GEOS needs a
separate copy in its format, plus working copies.
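A rough sketch of that arithmetic on a smaller grid, for anyone who wants to
check it without allocating 1.5G: measure one Polygons object and scale up by
the number of cells. The 50 x 100 size and the variable names are assumptions
made for illustration, and the per-polygon size will vary with the R and sp
versions in use.

```r
library(raster)  # also loads sp

# Small illustrative raster; 50 x 100 is an assumed size to keep this quick.
small <- raster(nrow = 50, ncol = 100)
small[] <- rpois(ncell(small), lambda = 70)
pol_small <- rasterToPolygons(small, dissolve = FALSE)

# Size in bytes of a single Polygons object (there is one per raster cell).
per_feature <- as.numeric(object.size(slot(pol_small, "polygons")[[1]]))

# Extrapolate to the 500 x 1000 raster: one polygon per cell.
n_cells_full <- 500 * 1000
estimated_bytes <- per_feature * n_cells_full
print(estimated_bytes / 1024^3)  # rough estimate in GiB
```

This only counts the polygons slot, so the full object (with its data slot)
will be somewhat larger.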
Could you work on tiles of the raster, then join those?
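A minimal sketch of the tiling idea, assuming the raster fits in memory and
that blocks of columns are an acceptable tiling. The helper name
polygons_by_tile and the n_tiles argument are hypothetical, not part of
raster; extent(), crop(), rasterToPolygons() and bind() are the raster
functions being used.

```r
library(raster)

# Hypothetical helper (not part of raster): polygonize a raster in blocks of
# columns so that each rasterToPolygons() call only sees a fraction of the cells.
# Assumes 2 <= n_tiles <= ncol(r).
polygons_by_tile <- function(r, n_tiles = 4) {
  # Column breakpoints for n_tiles contiguous blocks.
  breaks <- round(seq(0, ncol(r), length.out = n_tiles + 1))
  tiles <- lapply(seq_len(n_tiles), function(i) {
    # Extent covering all rows and columns breaks[i] + 1 .. breaks[i + 1].
    e <- extent(r, 1, nrow(r), breaks[i] + 1, breaks[i + 1])
    rasterToPolygons(crop(r, e), dissolve = FALSE)
  })
  # bind() joins the pieces into one Spatial* object.
  do.call(bind, tiles)
}

r <- raster(nrow = 50, ncol = 100)
r[] <- rpois(ncell(r), lambda = 70)
pol <- polygons_by_tile(r, n_tiles = 4)
length(pol)
```

Note that the joined object is as large as the untiled one; tiling only
bounds the working set of each call. If dissolving were done per tile,
polygons of the same class that straddle a tile boundary would stay split
and would need a further merge.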
We're still hoping that someone will help with bug-fixing in rgeos, but
this is also a data representation question, I think.
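For the idea above of stepping through chunks of feature classes with
dissolve=FALSE, a minimal sketch might look like the following. The chunk
size of 10, the output file names and the use of saveRDS() are assumptions
for illustration; the fun argument of rasterToPolygons() is what selects the
cells belonging to each chunk.

```r
library(raster)

r <- raster(nrow = 50, ncol = 100)
r[] <- rpois(ncell(r), lambda = 70)

# Feature classes present in the raster.
classes <- sort(unique(getValues(r)))

# Assumed chunking: handle the classes ten at a time.
chunk_size <- 10
chunks <- split(classes, ceiling(seq_along(classes) / chunk_size))

n_features <- 0
for (i in seq_along(chunks)) {
  keep <- chunks[[i]]
  # fun keeps only the cells whose value falls in this chunk of classes.
  pol_i <- rasterToPolygons(r, fun = function(x) x %in% keep, dissolve = FALSE)
  n_features <- n_features + length(pol_i)
  # Write each chunk out so it need not stay in memory (file name is illustrative).
  saveRDS(pol_i, file.path(tempdir(), sprintf("pol_chunk_%02d.rds", i)))
  rm(pol_i)
}
n_features
```

Here the chunks run in one session to keep the sketch short; running each
chunk in its own R script, as suggested above, would let the memory be
returned when each process exits.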
Hope this helps,
> On 04/01/2012 8:57 PM, pgalpern wrote:
>> Not sure if this is the best place to ask what may ultimately be an rgeos
>> question.
>> I am running the latest versions of the raster and rgeos packages under
>> 64-bit R 2.14.1 on Windows Server 2008 R2 with 12 GB RAM and am having some
>> challenges with memory.
>> I am turning rasters (approx 500 x 1000 cells) into 1500 SpatialPolygons,
>> representing each of the feature classes. It works as it should, using
>> rasterToPolygons(x, dissolve=TRUE), but memory overhead is sky high.
>> For example, a single instance of this function quickly consumes 2-3 GB and
>> would probably consume more if other instances were not also running
>> simultaneously. As a result disk swapping occurs which slows everything
>> down. Interestingly, the input raster and output SpatialPolygons objects
>> are only megabytes in size. Running this under 32-bit R doesn't seem to
>> help and occasionally results in memory allocation failures.
>> Finally, deleting the raster and polygons objects when the function is
>> complete and running gc() does not seem to release the memory. Instead the
>> entire R instance needs to be terminated.
>> Can anyone tell me if this is expected behaviour ... or perhaps suggest a
>> more memory efficient approach.
>> Thank you,
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no