[R-sig-Geo] GDAL.close
Roger Bivand
Roger.Bivand at nhh.no
Mon Nov 4 23:17:40 CET 2013
On Mon, 4 Nov 2013, Roger Bivand wrote:
> Which Windows? XP, Vista, 7, 8, 8.1? 32 and 64 bit? If Vista/7/8, run as
> administrator or not? I agree that the code in those parts of rgdal is not
> well-designed - it was well-designed, but has been modified so that it works
> for most people cross-platform, and has had to accommodate changes that have
> taken place in GDAL over more than 10 years, not least the error-handler.
>
> The simple solution to your practical problem is for you to use a larger
> temporary drive under Windows, or change to an operating system that does not
> have these side-effects.
>
> Assisting you is not just a matter of doing what you think works for you, but
> making sure it doesn't break anything else for anybody else cross-platform.
>
> Your script does not check for other files in tempdir, so I prepended a
> listing of prior content:
>
> pc <- dir(tempdir())
>
> and dropped them from the list for unlinking:
>
> now <- dir(tempdir())
> unlink(paste(tempdir(), now[!(now %in% pc)], sep=.Platform$file.sep))
>
> I do not see how your script exercises the problem. It creates a new
> transient file, but does not close it, which was the behaviour you are
> unhappy with. If I add
>
> GDAL.close(r3)
>
> on Linux, the transient dataset is removed. On Windows 7 64-bit with the CRAN
> rgdal binary run as user, temporary files are left in tempdir for r1, r2, and
> r3. The same three temporary files are left when run as administrator.
>
> The earliest version of GDAL.close was:
>
> GDAL.close <- function(dataset) {
> .setCollectorFun(slot(dataset, 'handle'), NULL)
> .Call('RGDAL_CloseDataset', dataset, PACKAGE="rgdal")
> invisible()
> }
>
> with a version in 2007 in the THK branch calling a closeDataset method,
> containing:
>
> handle <- slot(dataset, "handle")
> unreg.finalizer(handle)
> .Call("RGDAL_DeleteHandle", handle, PACKAGE="rgdal")
>
> with:
>
> unreg.finalizer <- function(obj) reg.finalizer(obj, function(x) x)
>
> and by 2010 was:
>
> GDAL.close <- function(dataset) {
> .setCollectorFun(slot(dataset, 'handle'), NULL)
> .Call('RGDAL_CloseDataset', dataset, PACKAGE="rgdal")
> invisible(gc())
> }
>
> Special handling of GDALTransientDataset was added in revision 433 in January
> 2013, and modified in revision 462 in April 2013.
>
> It has seemed IIRC that Windows can treat arbitrary files as open. It is also
> possible that there is an interaction between Windows and
> rgdal:::.setCollectorFun(), which does what it should, when given the NULL
> argument, setting:
>
> .setCollectorFun <- function(object, fun) {
>
> if (is.null(fun)) fun <- function(obj) obj
> reg.finalizer(object, fun, onexit=TRUE)
>
> }
>
> so incorporating the THK branch logic. It could possibly also vary across
> drivers, so finding a robust fix means setting up a test rig with multiple
> Windows machines and testing for multiple drivers to see why some temporary
> files are being treated as open when other operating systems don't have
> problems in their removal; this affects Windows users with too small
> temporary directories. I welcome contributions from people who understand
> Windows and
> can actually explain why we see the consequences we see.
>
> One candidate may be to branch to .Call("RGDAL_DeleteHandle", handle,
> PACKAGE="rgdal") for the GDALTransientDataset case; I'll report back once the
> package has gone through win-builder.
The Windows binary with this modification is at:
http://win-builder.r-project.org/TO95cIM24UVL
but I do not see that it has altered behaviour under Windows 7 running as
user. For some reason Windows sees the transient files as open. Please try
other drivers to ensure that this isn't driver-specific.
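A minimal sketch of such a per-driver check, using only base R so that it can be run before rgdal is involved. The helper name leftover_after is hypothetical and not part of rgdal; the driver names in the commented loop are only examples, and r2 is the SpatialGridDataFrame from the script quoted below. It lists tempdir() before an action, tries to unlink whatever is new, and returns what could not be removed:

```r
# Report entries that an action leaves behind in tempdir() and that
# unlink() cannot remove afterwards.
# `leftover_after` is a hypothetical helper, not part of rgdal.
leftover_after <- function(action, dir_path = tempdir()) {
  before <- dir(dir_path)                  # prior content, never touched
  action()                                 # e.g. a writeGDAL() call
  created <- setdiff(dir(dir_path), before)
  unlink(file.path(dir_path, created))     # non-recursive: files only
  # whatever is still present could not be removed
  intersect(created, dir(dir_path))
}

# Base-R check: an ordinary, closed file should be removable.
leftover_after(function() writeLines("x", file.path(tempdir(), "plain.txt")))

# With rgdal installed, one call per driver (assumed example drivers):
# for (drv in c("GTiff", "HFA", "ENVI")) {
#   out <- file.path(tempdir(), paste0("out_", drv))
#   print(leftover_after(function() writeGDAL(r2, out, drivername = drv)))
# }
```

Because prior content is excluded, the result names only entries created by the action itself, which should make reports from different machines and drivers easier to compare. Any directory created by the action is also reported, since unlink() is not called recursively here.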
Roger
>
> Hope this doesn't muddle too much, clarification doesn't seem like the right
> expression.
>
> Roger
>
>
> On Sun, 3 Nov 2013, Oliver Soong wrote:
>
>> I've been using the CRAN rgdal and raster. I apologize in advance for all
>> the linebreaks that will be broken. This code should highlight the problem
>> and the fix:
>>
>>
>>
>> require(rgdal)
>> require(raster)
>> r1 <- raster(system.file("external/test.grd", package="raster"))
>> r2 <- as(r1, "SpatialGridDataFrame")
>> r2.dims <- gridparameters(r2)$cells.dim
>> r3 <- new("GDALTransientDataset", driver = new("GDALDriver", "GTiff"),
>>   rows = r2.dims[2], cols = r2.dims[1], bands = 1, type = "Float32",
>>   options = NULL, fname = file.path(tempdir(), "r3.tif"), handle = NULL)
>> print(dir(tempdir()))
>> writeRaster(r1, file.path(tempdir(), "r1.tif"))
>> writeGDAL(r2, file.path(tempdir(), "r2.tif"))
>> print(dir(tempdir()))
>> unlink(dir(tempdir(), full.names = TRUE))
>> print(dir(tempdir()))
>> leftover <- gsub("/", "\\\\", dir(tempdir(), full.names = TRUE))
>> invisible(lapply(paste("cmd /c del", leftover), system))
>> rm(r1, r2, r3)
>> gc()
>> unlink(dir(tempdir(), full.names = TRUE))
>> print(dir(tempdir()))
>> invisible(lapply(paste("cmd /c del", leftover), system))
>>
>>
>>
>> Basically, I'm trying to write a standard raster package raster (r1) and an
>> sp package SpatialGridDataFrame (r2). Both of those end up calling
>> new("GDALTransientDataset"), hence r3. At the first print(dir(tempdir())),
>> only r3 has an open temporary file, which is expected. At the second, all
>> three have open temporary files, and r1 and r2 have their written final
>> outputs, which are closed. The temporary files for r1 and r2 should have
>> been closed at this point. None of the temporary files can be removed by
>> unlink, although the final outputs can, as shown at the third
>> print(dir(tempdir())). Windows can't remove them, either. However, if I
>> remove the GDALTransientDataset r3 and initiate gc(), R can remove that
>> temporary file, but this does not work for r1 and r2. After q(), the
>> tempdir() will not be removed by R, but it and the temporary files for r1
>> and r2 can now be removed.
>>
>> It looks like GDAL.close is broken (again/as always), but the collector
>> function for GDALTransientDataset seems to at least close the handle.
>> GDAL.close relies on RGDAL_CloseDataset, whereas the GDALTransientDataset
>> collector just uses RGDAL_CloseHandle. With the handle closed, I think the
>> unlink code in GDAL.close will work (as an aside, I'd use the pattern
>> paste0("^[a-z]{3}", basen, "$") to be safer and the argument full.names
>> might be simpler than constructing flf separately). I believe
>> RGDAL_CloseDataset checks for NULL handles but just returns early, so it
>> should be the same to replace the .Call("RGDAL_CloseDataset", ...) with
>> .Call("RGDAL_CloseHandle", ...).
>>
>> Really, I think RGDAL_DeleteHandle needs to be fixed, but I don't know
>> enough about GDALDeleteDataset or the #ifndef OSGEO4W deleteFile business
>> or why RGDAL_CloseHandle is commented out to make any useful suggestions
>> there.
>>
>> Cheers,
>> Oliver
>>
>>
>>
>>
>> On Fri, Nov 1, 2013 at 1:41 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>>
>>> On Mon, 28 Oct 2013, Oliver Soong wrote:
>>>
>>>> I've had a long-standing struggle with GDAL.close on Windows, and I
>>>> think I might finally have found a fix. I'm currently running rgdal
>>>> 0.8.11, R 3.0.2, and 32-bit Windows 7.
>>>>
>>>> Currently, writeRaster and writeGDAL create temporary files in the
>>>> tempdir() folder (the final filename prefixed with 3 random [a-z]
>>>> letters). On my system, these files get left open and orphaned. When
>>>> doing heavy processing, this can cause the drive hosting the
>>>> tempdir() folder to become full, even if the data is being ultimately
>>>> written to a much larger drive. This also means that R cannot clean
>>>> up these files or the tempdir() folder when it closes, causing similar
>>>> bloat in my %TEMP%.
>>>>
>>>> I haven't tested this on other platforms, but I think it might help to
>>>> insert an extra line into GDAL.close:
>>>>
>>>> .setCollectorFun(slot(dataset, "handle"), NULL)
>>>> .Call("RGDAL_CloseHandle", dataset@handle, PACKAGE = "rgdal")
>>>> .Call("RGDAL_CloseDataset", dataset, PACKAGE = "rgdal")
>>>>
>>>> For whatever reason, RGDAL_CloseDataset doesn't seem to actually close
>>>> the C file handle, but it doesn't seem to mind if the file handle was
>>>> closed beforehand.
>>>>
>>>
>>> Could you please provide a working example? I have looked at this, but
>>> need a baseline to know whether I'm looking at the same thing. I'm very
>>> unsure that this is a robust solution, and need an instrumented example,
>>> including listings of the temporary directory during the process, to see
>>> the consequences. Thanks for looking into this, but I'd prefer to be sure
>>> that a Windows-specific fix doesn't make things worse for others too.
>>> Please also report on the source of your Windows rgdal binary - is it from
>>> CRAN or locally built dynamically linking your own GDAL?
>>>
>>> Best wishes,
>>>
>>> Roger
>>>
>>>
>>>> Cheers,
>>>> Oliver
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>
>>>>
>>> --
>>> Roger Bivand
>>> Department of Economics, NHH Norwegian School of Economics,
>>> Helleveien 30, N-5045 Bergen, Norway.
>>> voice: +47 55 95 93 55; fax +47 55 95 95 43
>>> e-mail: Roger.Bivand at nhh.no
>>>
>>>
>>
>
>
--
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no