[R-sig-Geo] raster/rgdal problem: Too many open files (Linux)

Roger Bivand Roger.Bivand at nhh.no
Tue Aug 6 21:35:21 CEST 2013


On Tue, 6 Aug 2013, Mauricio Zambrano-Bigiarini wrote:

> On 08/05/2013 04:37 PM, Jon Olav Skoien wrote:
>> Dear list,
>> 
>> We have a problem which appears to be a bug in either rgdal or raster,
>> although it could also be a bug in base R or in our understanding of how
>> to deal with connections.
>> 
>> We have a process which writes a rather large number (~10,000-20,000) of
>> GeoTIFFs via writeRaster(). However, the process has frequently stopped
>> with an error of the type:
>> Error in .local(.Object, ...) :
>>     TIFFOpen:/local0/skoiejo/hri/test.tif: Too many open files
>> The issue seems to be the creation of temp-files in the temp directory
>> which is given by tempdir(), not by raster:::.tmpdir(). These temp-files
>> seem to be created by the call
>>     transient <- new("GDALTransientDataset", driver=driver, rows=r@nrows,
>> cols=r@ncols, bands=nbands, type=dataformat, fname=filename,
>> options=options, handle=NULL)
>> from raster:::.getGDALtransient
>> The temp-files are deleted after the GeoTIFF is written, but their file
>> descriptors are not released by the process, which on our Linux system is
>> limited to 1024 open files per process (ulimit -n). Below is a script
>> which replicates the issue (it takes a few minutes to reach 1024),
>> followed by sessionInfo().
>> 
>> Currently we are trying to work around the issue by increasing the limit
>> on open file connections, but we would prefer a solution in which the
>> connections are properly closed, either before writeRaster() returns or
>> via a command we could include in our script, whether R code or a call to
>> system(). The connections are not visible via showConnections(), and
>> closeAllConnections() does not help.
>> 
>> Thanks,
>> Jon
>
> I stumbled across the same problem (with exactly the same configuration 
> Jon reported in his sessionInfo()), while trying to change the values of 
> some pixels in more than 6000 maps.
>
> Thank you very much, Jon, for the detailed report, which helped me find a 
> workaround (so far, just splitting the 6000 maps into smaller groups).
>
>> 
>> 
>> library(raster)
>> r <- raster(system.file("external/test.grd", package="raster"))
>> for (ifile in 1:2000) {
>>     writeRaster(r, "test.tif", format = "GTiff", overwrite = TRUE)
>>     print(ifile)
>> }
>> 
>
> After running the reproducible code above, I don't understand why I got 
> the error at ifile=1019 rather than 1024:
>
> ....
> [1] 1018
> [1] 1019
> Error in .local(.Object, ...) :
>  TIFFOpen:/home/hzambran/test.tif: Too many open files
>

There are other files opened by the R process, which reduces the number of 
descriptors left before the limit is reached. The problem is in the GDAL 
bindings for R; I haven't checked whether other applications that keep GDAL 
loaded face the same issue. GDAL command-line applications typically write 
once and exit, so this isn't a problem there.
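
To see this directly, something like the following can be used to watch the 
descriptor count grow: a minimal diagnostic sketch, Linux-only, since it 
counts the entries under /proc/<pid>/fd, and the fd_count() helper is purely 
illustrative:

library(raster)
fd_count <- function() length(list.files(sprintf("/proc/%d/fd", Sys.getpid())))
r <- raster(system.file("external/test.grd", package="raster"))
for (ifile in 1:20) {
    # each write should return all descriptors it used, but the count climbs
    writeRaster(r, "test.tif", format = "GTiff", overwrite = TRUE)
    cat(ifile, "open descriptors:", fd_count(), "\n")
}

The count reported on the first iteration gives the offset that explains 
hitting the limit at ifile=1019 rather than 1024.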

The current GDAL.close() code calls unlink() on a vector of files with the 
same basename, but unlink() now appears to fail, leaving the files in place. 
Using file.remove() leads to the same result, and using deleteFile() 
provokes other problems.
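
For anyone who wants to probe below raster, a sketch along these lines 
exercises the transient dataset directly (untested as written; it assumes 
the rgdal constructors keep the signatures shown, and reuses the Linux-only 
fd_count() idea from above):

library(rgdal)
fd_count <- function() length(list.files(sprintf("/proc/%d/fd", Sys.getpid())))
drv <- new("GDALDriver", "GTiff")
for (i in 1:5) {
    # create a transient dataset backed by a temp file, then close it
    tds <- new("GDALTransientDataset", drv, rows = 10, cols = 10)
    GDAL.close(tds)
    # if the count keeps climbing, the descriptor leak sits at this level
    cat(i, "open descriptors:", fd_count(), "\n")
}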

This will probably turn out to be something trivial, but it will take a 
great deal of time to debug, as the consequences of changing the dataset 
structure could be extensive.

For the time being, the work-around is the only route; if volunteers can 
debug this, progress may be possible, but everything else has to continue 
to work.
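
Mauricio's grouping work-around can be scripted along these lines: a hedged 
sketch in which process_batch.R is a hypothetical driver script taking start 
and end indices, the point being that each batch runs in a fresh R process 
so any leaked descriptors are released when it exits:

n_maps <- 6000
batch  <- 500   # keep each batch well below the 'ulimit -n' value
for (s in seq(1, n_maps, by = batch)) {
    e <- min(s + batch - 1, n_maps)
    system(sprintf("Rscript process_batch.R %d %d", s, e))
}

Raising the limit itself, as Jon is already doing, also works where 
permissions allow, e.g. launching the job via something like 
bash -c 'ulimit -n 4096 && Rscript script.R'.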

Roger

>
>
> Thanks again Jon for sharing your findings about this.
>
> All the best,
>
> Mauricio Zambrano-Bigiarini, Ph.D
>
>
>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


