[Rd] proper use of reg.finalizer to close connections

Henrik Bengtsson hb at biostat.ucsf.edu
Mon Oct 27 06:02:48 CET 2014


On Sun, Oct 26, 2014 at 8:14 PM, Murat Tasan <mmuurr at gmail.com> wrote:
> Ah (again)!
> Even with my fumbling presentation of the issue, you gave me the hint
> that solved it, thanks!
>
> Yes, the reg.finalizer call needs to be wrapped in an .onLoad hook so
> it's not called once during package installation and then never again.
> And once I switched to using ls() (instead of names()), everything
> works as expected.
>
> So, the package code effectively looks like so:
>
> .CONNS <- new.env(parent = emptyenv())
> .onLoad <- function(libname, pkgname) {
>     reg.finalizer(.CONNS, function(x) sapply(ls(x), .disconnect))
> }
> .disconnect <- function(x) {
>     ## handle disconnection of .CONNS[[x]] here
> }

In your example above, I would be concerned about what happens if you
detach/unload your package, because then your finalizer is still
registered and will be called whenever '.CONNS' is garbage
collected (or thereafter).  However, the finalizer function calls
.disconnect(), which is no longer available.
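
One way to reduce that risk might be to close everything explicitly in
an .onUnload hook, so that a finalizer firing later finds nothing left
to do.  This is only a minimal sketch, not tested inside a real
package: '.CONNS' and '.disconnect' follow your code, while
'.disconnect_all' and the body of '.disconnect' are placeholders I
made up for illustration.

```r
## Hypothetical package code; only .CONNS and .disconnect come from the thread.
.CONNS <- new.env(parent = emptyenv())

.disconnect <- function(name) {
    ## Placeholder: real code would close the connection in .CONNS[[name]]
    ## before dropping the entry.
    if (exists(name, envir = .CONNS, inherits = FALSE)) {
        rm(list = name, envir = .CONNS)
    }
    invisible(name)
}

## Assumed helper: disconnect every entry currently held in .CONNS.
.disconnect_all <- function() {
    for (name in ls(.CONNS, all.names = TRUE)) .disconnect(name)
    invisible(NULL)
}

.onLoad <- function(libname, pkgname) {
    ## Safety net for the case where R quits with the package still loaded.
    reg.finalizer(.CONNS, function(e) .disconnect_all(), onexit = TRUE)
}

.onUnload <- function(libpath) {
    ## Runs while the namespace is being unloaded, so .disconnect() still exists.
    .disconnect_all()
}
```

With this layout the finalizer may still run after the package is
unloaded, but by then '.CONNS' should already be empty.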

Finalizers should be used with great care, because you are not in
control of the order in which things occur, of which "resources" are
still around when the finalizer function is eventually called, or of
when it is called.  I've been bitten by this a few times, and such
bugs can be very hard to reproduce and troubleshoot.  See also the
'Note' of ?reg.finalizer.
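
To illustrate what a more defensive finalizer could look like, here is
a rough sketch along the lines of Gábor's suggestion of one environment
per connection.  The names 'new_conn' and '.close_handle' are
hypothetical, and a plain file() connection stands in for whatever
your package actually manages.  The finalizer uses only base functions,
tolerates being called more than once, and swallows errors.

```r
## Assumed finalizer: closes the connection stored in a handle environment.
.close_handle <- function(h) {
    tryCatch({
        if (!is.null(h$con)) {
            close(h$con)
            h$con <- NULL    # mark as closed so a second call does nothing
        }
    }, error = function(err) NULL)  # never let a finalizer raise an error
    invisible(NULL)
}

## Assumed constructor: one environment (and one finalizer) per connection.
new_conn <- function(path) {
    handle <- new.env(parent = emptyenv())
    handle$con <- file(path, open = "r")
    reg.finalizer(handle, .close_handle, onexit = TRUE)
    handle
}

## Usage: close explicitly when done; the finalizer is only a fallback.
tmp <- tempfile()
writeLines("hello", tmp)
h <- new_conn(tmp)
.close_handle(h)
```

Even so, I would treat the finalizer as a last resort and close
connections explicitly wherever that is possible.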

My $.02

/Henrik

>
> Cheers and thanks!
>
> -m
>
>
>
>
> On Sun, Oct 26, 2014 at 8:53 PM, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
>> Well, to be honest I don't understand fully what you are trying to do.
>> If you want to run code when the package is detached or when it is
>> unloaded, then use a hook:
>> http://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Load-hooks
>>
>> If you want to run code when an object is freed, then use a finalizer.
>>
>> Note that when you install a package, R runs all the code in the
>> package and only stores the results of the code in the installed
>> package. So if you create an object outside of a function in your
>> package, then only the object will be stored in the package, but not
>> the code that creates it. The object will be simply loaded when you
>> load the package, but it will not be re-created.
>>
>> Now, I am not sure what happens if you set the finalizer on such an
>> object in the package. I can imagine that the finalizer will not be
>> saved into the package, and is only used once, when
>> building/installing the package. In this case you'll need to set the
>> finalizer in .onLoad().
>>
>> Gabor
>>
>> On Sun, Oct 26, 2014 at 10:35 PM, Murat Tasan <mmuurr at gmail.com> wrote:
>>> Ah, thanks for the ls() vs names() tip!
>>> (But sadly, it didn't solve the issue... )
>>>
>>> So, after some more tinkering, I believe the finalizer is being called
>>> _sometimes_.
>>> I changed the reg.finalizer(...) call to just this:
>>>
>>> reg.finalizer(.CONNS, function(x) print("foo"), onexit  = TRUE)
>>>
>>> Now, when I load the package and detach(..., unload = TRUE), nothing prints.
>>> And when I quit, nothing prints.
>>>
>>> If I, however, create an environment on the workspace, like so:
>>>> e <- new.env(parent = emptyenv())
>>>> reg.finalizer(e, function(x) print("bar"), onexit = TRUE)
>>> When I quit (or rm(e)), "bar" is printed.
>>> But no "foo" (corresponding to same sequence of code, just in the
>>> package instead).
>>>
>>> BUT(!), when I _install_ the package, "foo" is printed at the end of
>>> the "**testing if installed package can be loaded" installation
>>> segment.
>>> So, somehow the R script that tests for package loading/unloading is
>>> triggering the finalizer (which is good).
>>> Yet, I cannot seem to trigger it myself when either quitting or
>>> forcing a package unload (which is bad).
>>>
>>> Any ideas why the installation script would successfully trigger a
>>> finalizer while standard unloading or quitting wouldn't?
>>>
>>> Cheers and thanks!
>>>
>>> -m
>>>
>>> On Sun, Oct 26, 2014 at 8:03 PM, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
>>>> Hmmm, I guess you will want to put the actual objects that represent
>>>> the connections into the environment, at least this seems to be the
>>>> easiest to me. Btw. you need ls() to list the contents of an
>>>> environment, instead of names(). E.g.
>>>>
>>>> e <- new.env()
>>>> e$foo <- 10
>>>> e$bar <- "aaa"
>>>> names(e)
>>>> #> NULL
>>>> ls(e)
>>>> #> [1] "bar" "foo"
>>>> reg.finalizer(e, function(x) { print(ls(x)) })
>>>> #> NULL
>>>> rm(e)
>>>> gc()
>>>> #> [1] "bar" "foo"
>>>> #>           used (Mb) gc trigger  (Mb) max used  (Mb)
>>>> #> Ncells 1528877 81.7    2564037 137.0  2564037 137.0
>>>> #> Vcells 3752538 28.7    7930384  60.6  7930356  60.6
>>>>
>>>> More precisely, you probably want to represent each connection as a
>>>> separate environment, with its own finalizer. Hope this helps,
>>>> Gabor
>>>>
>>>> On Sun, Oct 26, 2014 at 9:49 PM, Murat Tasan <mmuurr at gmail.com> wrote:
>>>>> Hi all, I have a question about finalizers...
>>>>> I have a package that manages state for a few connections, and I'd
>>>>> like to ensure that these connections are 'cleanly' closed upon either
>>>>> (i) R quitting or (ii) an unloading of the package.
>>>>> So, in a pared-down example package with a single R file, it looks
>>>>> something like:
>>>>>
>>>>> ##### BEGIN PACKAGE CODE #####
>>>>> .CONNS <- new.env(parent = emptyenv())
>>>>> .CONNS$resource1 <- NULL
>>>>> .CONNS$resource2 <- NULL
>>>>> ## some more .CONNS resources...
>>>>>
>>>>> reg.finalizer(.CONNS, function(x) sapply(names(x), disconnect), onexit = TRUE)
>>>>>
>>>>> connect <- function(x) {
>>>>>   ## here lies code to connect and update .CONNS[[x]]
>>>>> }
>>>>> disconnect <- function(x) {
>>>>>   print(sprintf("disconnect(%s)", x))
>>>>>   ## here lies code to disconnect and update .CONNS[[x]]
>>>>> }
>>>>> ##### END PACKAGE CODE #####
>>>>>
>>>>> The print(...) statement in disconnect(...) is there as a trace, as I
>>>>> hoped that I'd see disconnect(...) being called when I quit (or
>>>>> detach(..., unload = TRUE)).
>>>>> But, it doesn't appear that disconnect(...) is ever called when the
>>>>> package (and .CONNS) falls out of memory/scope (and I ran gc() after
>>>>> detach(...), just to be sure).
>>>>>
>>>>> In a second 'shot-in-the-dark' attempt, I placed the reg.finalizer
>>>>> call inside an .onLoad function, but that didn't seem to work, either.
>>>>>
>>>>> I'm guessing my use of reg.finalizer is way off-base here... but I
>>>>> cannot infer from the reg.finalizer man page what I might be doing
>>>>> wrong.
>>>>> Is there a way to see, at the R-system level, what functions have been
>>>>> registered as finalizers?
>>>>>
>>>>> Thanks for any pointers!
>>>>>
>>>>> -Murat
>>>>>
>>>>> ______________________________________________
>>>>> R-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>


