[R-sig-Geo] Memory management with rasterToPolygons (raster and rgeos)
Roger Bivand
Roger.Bivand at nhh.no
Thu Jan 5 19:19:13 CET 2012
On Thu, 5 Jan 2012, pgalpern wrote:
> I did some further research into my own question when I twigged to the
> idea that this might be a memory leak with the GEOS library.
>
> It seems likely that it is, and this has been documented in this forum
> this past October:
> https://mailman.stat.ethz.ch/pipermail/r-sig-geo/2011-October/013289.html
>
> As of October there didn't appear to be any real resolution to the problem,
> except - perhaps - to run rgeos under Linux.
>
> Is this the status quo?
The issue with rgeos/GEOS is unresolved, and has persisted through at least
two releases in the meantime. Using Linux does not help. It may be possible
to run with dissolve=FALSE and step through chunks of feature classes in
separate R scripts. However, it isn't just an rgeos issue, as:
library(raster)
r <- raster(nrow=500, ncol=1000)
r[] <- rpois(ncell(r), lambda=70)
pol <- rasterToPolygons(r, dissolve=FALSE)
gives me:
> object.size(r)
4011736 bytes
> object.size(pol)
1458003216 bytes
so pol is about 1.5 GB here, with 73 categories (I forgot to set.seed()).
Only raster and sp are attached here. The arithmetic is:
> object.size(slot(pol, "polygons")[[1]])
2896 bytes
> dim(pol)
[1] 500000 1
> object.size(slot(pol, "polygons")[[1]])*dim(pol)[1]
1.448e+09 bytes
so the input SpatialPolygons object is already large, and GEOS needs a
separate copy in its format, plus working copies.
Could you work on tiles of the raster, then join those?
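A minimal sketch of that tile-based approach, assuming a 2 x 4 grid of
250 x 250-cell tiles over the example raster above (the tile counts and
raster are illustrative, not a tested recipe):

```r
library(raster)

# Illustrative raster, as in the example above
r <- raster(nrow=500, ncol=1000)
r[] <- rpois(ncell(r), lambda=70)

tiles <- list()
n <- 1
for (i in 1:2) {        # rows of tiles
  for (j in 1:4) {      # columns of tiles
    # extent() accepts row/column indices for a Raster object
    e <- extent(r, 1 + (i-1)*250, i*250, 1 + (j-1)*250, j*250)
    rt <- crop(r, e)
    tiles[[n]] <- rasterToPolygons(rt, dissolve=FALSE)
    rm(rt); gc()        # release each tile's working memory promptly
    n <- n + 1
  }
}
# Join the per-tile polygon sets back into one object; makeUniqueIDs
# avoids clashes between the per-tile polygon IDs
pol <- do.call(rbind, c(tiles, list(makeUniqueIDs=TRUE)))
```

Each call then only ever holds one tile's polygons plus GEOS's working
copies, rather than the full 1.5 GB object.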
We're still hoping that someone will help with bug-fixing in rgeos, but
this is also a data representation question, I think.
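The chunked dissolve=FALSE workflow mentioned above might look like the
following sketch: polygonise and dissolve one class at a time, write each
result to disk, and free memory before the next class. File names and the
raster are illustrative assumptions; if the leak persists in-process, each
class could be run in its own R script instead of a loop.

```r
library(raster)

# Illustrative raster, as in the example above
r <- raster(nrow=500, ncol=1000)
r[] <- rpois(ncell(r), lambda=70)

for (v in unique(values(r))) {
  rv <- r
  rv[values(r) != v] <- NA          # keep only cells of class v
  pv <- rasterToPolygons(rv, dissolve=TRUE)
  saveRDS(pv, file=paste0("class_", v, ".rds"))
  rm(rv, pv); gc()                  # try to return memory between classes
}
```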
Hope this helps,
Roger
>
> Thanks,
> Paul
>
> On 04/01/2012 8:57 PM, pgalpern wrote:
>> Hello!
>>
>> Not sure if this is the best place to ask what may ultimately be an rgeos
>> question.
>>
>> I am running the latest versions of the raster and rgeos packages under
>> 64-bit R 2.14.1 on Windows 2008R2 Server with 12 GB RAM, and am having
>> some challenges with memory.
>>
>> I am turning rasters (approx. 500 x 1000 cells) into 1500 SpatialPolygons,
>> representing each of the feature classes. It works as it should using
>> rasterToPolygons(x, dissolve=T), but the memory overhead is sky-high.
>>
>> For example, a single instance of this function quickly consumes 2-3 GB and
>> would probably consume more if other instances were not also running
>> simultaneously. As a result disk swapping occurs which slows everything
>> down. Interestingly, the input raster and output SpatialPolygons objects
>> are only megabytes in size. Running this under 32-bit R doesn't seem to
>> help and occasionally results in memory allocation failures.
>>
>> Finally, deleting the raster and polygons objects when the function is
>> complete and running gc() does not seem to release the memory. Instead the
>> entire R instance needs to be terminated.
>>
>> Can anyone tell me if this is expected behaviour, or perhaps suggest a
>> more memory-efficient approach?
>>
>> Thank you,
>> Paul
>>
>
>
--
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no