[R-sig-Geo] Memory management with rasterToPolygons (raster and rgeos)

Roger Bivand Roger.Bivand at nhh.no
Thu Jan 5 19:19:13 CET 2012


On Thu, 5 Jan 2012, pgalpern wrote:

> I did some further research into my own question when I twigged to the 
> idea that this might be a memory leak with the GEOS library.
>
> It seems likely that it is; it was documented in this forum this past 
> October: 
> https://mailman.stat.ethz.ch/pipermail/r-sig-geo/2011-October/013289.html
>
> As of October there didn't appear to be any real resolution to the problem, 
> except - perhaps - to run rgeos under Linux.
>
> Is this the status quo?

The issue with rgeos/GEOS is unresolved, and has persisted through at 
least two releases in the meantime. Using Linux does not help. It may be 
possible to run with dissolve=FALSE and step through chunks of feature 
classes in separate R scripts. However, it isn't just an rgeos issue, as:

library(raster)
r <- raster(nrow=500, ncol=1000)
r[] <- rpois(ncell(r), lambda=70)
pol <- rasterToPolygons(r, dissolve=FALSE)

gives me:

> object.size(r)
4011736 bytes
> object.size(pol)
1458003216 bytes

so pol is 1.5G here, with 73 categories (I forgot to set.seed()). Only 
raster and sp are present here. The arithmetic is:

> object.size(slot(pol, "polygons")[[1]])
2896 bytes
> dim(pol)
[1] 500000      1
> object.size(slot(pol, "polygons")[[1]])*dim(pol)[1]
1.448e+09 bytes

so the input SpatialPolygons object is already large, and GEOS needs a 
separate copy in its format, plus working copies.
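To make the chunked, dissolve=FALSE idea above more concrete, here is a 
rough sketch (untested; the synthetic raster, the chunk size of 50, and 
the use of the fun= argument of rasterToPolygons() to mask out cells not 
in the current chunk are all my assumptions). Writing each piece to disk 
means a fresh R session can be started between chunks if memory is not 
released:

```r
## Sketch: polygonise the feature classes in chunks of 50, dissolving
## within each chunk, and save each piece to disk before freeing it.
library(raster)
r <- raster(nrow = 500, ncol = 1000)
r[] <- sample(1:1500, ncell(r), replace = TRUE)  # stand-in for 1500 classes
classes <- sort(unique(na.omit(values(r))))
chunks <- split(classes, ceiling(seq_along(classes) / 50))
for (i in seq_along(chunks)) {
    keep <- chunks[[i]]
    ## fun= masks out cells whose value is not in the current chunk
    pol_i <- rasterToPolygons(r, fun = function(v) v %in% keep,
                              dissolve = TRUE)
    saveRDS(pol_i, sprintf("pol_chunk_%03d.rds", i))
    rm(pol_i)
    gc()
}
```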

Could you work on tiles of the raster, then join those?
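As a sketch of what I mean (untested; the four-quadrant split, and the 
use of spChFIDs() to keep polygon IDs unique so that rbind() accepts the 
pieces, are my assumptions):

```r
## Polygonise four tiles of the raster separately, then rbind the
## pieces back together; spChFIDs() renames the polygon IDs so they
## do not collide across tiles.
library(raster)
library(sp)
r <- raster(nrow = 500, ncol = 1000)
r[] <- rpois(ncell(r), lambda = 70)
tiles <- list(extent(r, 1, 250, 1, 500),
              extent(r, 1, 250, 501, 1000),
              extent(r, 251, 500, 1, 500),
              extent(r, 251, 500, 501, 1000))
pieces <- vector("list", length(tiles))
for (i in seq_along(tiles)) {
    p <- rasterToPolygons(crop(r, tiles[[i]]), dissolve = FALSE)
    pieces[[i]] <- spChFIDs(p, paste(i, row.names(p), sep = "_"))
    rm(p)
    gc()   # try to release the working copy before the next tile
}
pol <- do.call(rbind, pieces)
```

Each tile's SpatialPolygons working set is then a quarter of the size, 
at the cost of seams along the tile boundaries if you later dissolve.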

We're still hoping that someone will help with bug-fixing in rgeos, but 
this is also a data representation question, I think.

Hope this helps,

Roger

>
> Thanks,
> Paul
>
> On 04/01/2012 8:57 PM, pgalpern wrote:
>> Hello!
>> 
>> Not sure if this is the best place to ask what may ultimately be an rgeos 
>> question.
>> 
>> I am running the latest versions of the raster and rgeos packages under 
>> 64-bit R 2.14.1 on Windows 2008R2 Server with 12GB RAM, and I am having 
>> some challenges with memory.
>> 
>> I am turning rasters (approx. 500 x 1000 cells) into 1500 SpatialPolygons 
>> objects, one for each feature class. It works as it should, using 
>> rasterToPolygons(x, dissolve=T), but the memory overhead is sky high.
>> 
>> For example, a single instance of this function quickly consumes 2-3 GB, 
>> and would probably consume more if other instances were not also running 
>> simultaneously. As a result, disk swapping occurs, which slows everything 
>> down. Interestingly, the input raster and output SpatialPolygons objects 
>> are only megabytes in size. Running this under 32-bit R doesn't seem to 
>> help and occasionally results in memory allocation failures.
>> 
>> Finally, deleting the raster and polygons objects when the function is 
>> complete and running gc() does not seem to release the memory. Instead, 
>> the entire R instance needs to be terminated.
>> 
>> Can anyone tell me if this is expected behaviour, or perhaps suggest a 
>> more memory-efficient approach?
>> 
>> Thank you,
>> Paul
>> 
>
>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
