[Rd] proper use of reg.finalizer to close connections

Murat Tasan mmuurr at gmail.com
Mon Oct 27 07:18:21 CET 2014


Ah, good point, I hadn't thought of that detail.
Would moving reg.finalizer back outside of .onLoad and hooking it to the
package's environment itself work (more safely)?
Something like:
finalizerFunction <- function(env) {
    ## cleanup code
}
reg.finalizer(topenv(), finalizerFunction)  # topenv() is the package namespace here
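
For what it's worth, here is a rough sketch of one way the concern might be eased, whichever environment the finalizer ends up attached to: build the finalizer so that it carries its cleanup function in its own closure, instead of looking up a package-level function when the collection happens. This is plain R run outside any package; make_finalizer, close_one, and trace_log are hypothetical names, and the logging only stands in for real disconnect code.

```r
## Sketch only: make_finalizer(), close_one and trace_log are hypothetical.
make_finalizer <- function(close_one) {
    force(close_one)    # capture the cleanup function in the closure
    function(env) {
        for (name in ls(env)) close_one(env[[name]])
        invisible(NULL)
    }
}

## Records which handles were "closed", purely for demonstration.
trace_log <- new.env(parent = emptyenv())
trace_log$closed <- character(0)

conns <- new.env(parent = emptyenv())
conns$resource1 <- "handle-1"
conns$resource2 <- "handle-2"

reg.finalizer(conns,
              make_finalizer(function(h) {
                  trace_log$closed <- c(trace_log$closed, h)
              }),
              onexit = TRUE)

rm(conns)
invisible(gc())    # the finalizer normally runs during this collection
```

Whether this is enough inside a real package is a separate question, since anything the captured function itself depends on still has to exist when the finalizer runs.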

-m
 On Oct 26, 2014 11:03 PM, "Henrik Bengtsson" <hb at biostat.ucsf.edu> wrote:

> On Sun, Oct 26, 2014 at 8:14 PM, Murat Tasan <mmuurr at gmail.com> wrote:
> > Ah (again)!
> > Even with my fumbling presentation of the issue, you gave me the hint
> > that solved it, thanks!
> >
> > Yes, the reg.finalizer call needs to be wrapped in an .onLoad hook so
> > it's not called once during package installation and then never again.
> > And once I switched to using ls() (instead of names()), everything
> > works as expected.
> >
> > So, the package code effectively looks like so:
> >
> > .CONNS <- new.env(parent = emptyenv())
> > .onLoad <- function(libname, pkgname) {
> >     reg.finalizer(.CONNS, function(x) sapply(ls(x), .disconnect))
> > }
> > .disconnect <- function(x) {
> >     ## handle disconnection of .CONNS[[x]] here
> > }
>
> In your example above, I would be concerned about what happens if you
> detach/unload your package, because then your finalizer is still
> registered and will be called whenever '.CONNS' is garbage
> collected (or thereafter).  However, the finalizer function calls
> .disconnect(), which is no longer available.
>
> Finalizers should be used with great care, because you control neither
> the order in which things occur, nor which "resources" are still
> around when the finalizer function is eventually called, nor when it
> is called.  I've been bitten by this a few times, and such bugs can be
> very hard to reproduce and troubleshoot.  See also the 'Note' of
> ?reg.finalizer.
>
> My $.02
>
> /Henrik
>
> >
> > Cheers and thanks!
> >
> > -m
> >
> >
> >
> >
> > On Sun, Oct 26, 2014 at 8:53 PM, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
> >> Well, to be honest I don't understand fully what you are trying to do.
> >> If you want to run code when the package is detached or when it is
> >> unloaded, then use a hook:
> >> http://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Load-hooks
> >>
> >> If you want to run code when an object is freed, then use a finalizer.
> >>
> >> Note that when you install a package, R runs all the code in the
> >> package and only stores the results of the code in the installed
> >> package. So if you create an object outside of a function in your
> >> package, then only the object will be stored in the package, but not
> >> the code that creates it. The object will be simply loaded when you
> >> load the package, but it will not be re-created.
> >>
> >> Now, I am not sure what happens if you set the finalizer on such an
> >> object in the package. I can imagine that the finalizer will not be
> >> saved into the package, and is only used once, when
> >> building/installing the package. In this case you'll need to set the
> >> finalizer in .onLoad().
> >>
> >> Gabor
> >>
> >> On Sun, Oct 26, 2014 at 10:35 PM, Murat Tasan <mmuurr at gmail.com> wrote:
> >>> Ah, thanks for the ls() vs names() tip!
> >>> (But sadly, it didn't solve the issue... )
> >>>
> >>> So, after some more tinkering, I believe the finalizer is being called
> >>> _sometimes_.
> >>> I changed the reg.finalizer(...) call to just this:
> >>>
> >>> reg.finalizer(.CONNS, function(x) print("foo"), onexit = TRUE)
> >>>
> >>> Now, when I load the package and detach(..., unload = TRUE), nothing prints.
> >>> And when I quit, nothing prints.
> >>>
> >>> If I, however, create an environment on the workspace, like so:
> >>>> e <- new.env(parent = emptyenv())
> >>>> reg.finalizer(e, function(x) print("bar"), onexit = TRUE)
> >>> When I quit (or rm(e)), "bar" is printed.
> >>> But no "foo" (corresponding to same sequence of code, just in the
> >>> package instead).
> >>>
> >>> BUT(!), when I _install_ the package, "foo" is printed at the end of
> >>> the "**testing if installed package can be loaded" installation
> >>> segment.
> >>> So, somehow the R script that tests for package loading/unloading is
> >>> triggering the finalizer (which is good).
> >>> Yet, I cannot seem to trigger it myself when either quitting or
> >>> forcing a package unload (which is bad).
> >>>
> >>> Any ideas why the installation script would successfully trigger a
> >>> finalizer while standard unloading or quitting wouldn't?
> >>>
> >>> Cheers and thanks!
> >>>
> >>> -m
> >>>
> >>> On Sun, Oct 26, 2014 at 8:03 PM, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
> >>>> Hmmm, I guess you will want to put the actual objects that represent
> >>>> the connections into the environment, at least this seems to be the
> >>>> easiest to me. Btw. you need ls() to list the contents of an
> >>>> environment, instead of names(). E.g.
> >>>>
> >>>> e <- new.env()
> >>>> e$foo <- 10
> >>>> e$bar <- "aaa"
> >>>> names(e)
> >>>> #> NULL
> >>>> ls(e)
> >>>> #> [1] "bar" "foo"
> >>>> reg.finalizer(e, function(x) { print(ls(x)) })
> >>>> #> NULL
> >>>> rm(e)
> >>>> gc()
> >>>> #> [1] "bar" "foo"
> >>>> #>           used (Mb) gc trigger  (Mb) max used  (Mb)
> >>>> #> Ncells 1528877 81.7    2564037 137.0  2564037 137.0
> >>>> #> Vcells 3752538 28.7    7930384  60.6  7930356  60.6
> >>>>
> >>>> More precisely, you probably want to represent each connection as a
> >>>> separate environment, with its own finalizer. Hope this helps,
> >>>> Gabor
> >>>>
> >>>> On Sun, Oct 26, 2014 at 9:49 PM, Murat Tasan <mmuurr at gmail.com> wrote:
> >>>>> Hi all, I have a question about finalizers...
> >>>>> I have a package that manages state for a few connections, and I'd
> >>>>> like to ensure that these connections are 'cleanly' closed upon either
> >>>>> (i) R quitting or (ii) an unloading of the package.
> >>>>> So, in a pared-down example package with a single R file, it looks
> >>>>> something like:
> >>>>>
> >>>>> ##### BEGIN PACKAGE CODE #####
> >>>>> .CONNS <- new.env(parent = emptyenv())
> >>>>> .CONNS$resource1 <- NULL
> >>>>> .CONNS$resource2 <- NULL
> >>>>> ## some more .CONNS resources...
> >>>>>
> >>>>> reg.finalizer(.CONNS, function(x) sapply(names(x), disconnect), onexit = TRUE)
> >>>>>
> >>>>> connect <- function(x) {
> >>>>>   ## here lies code to connect and update .CONNS[[x]]
> >>>>> }
> >>>>> disconnect <- function(x) {
> >>>>>   print(sprintf("disconnect(%s)", x))
> >>>>>   ## here lies code to disconnect and update .CONNS[[x]]
> >>>>> }
> >>>>> ##### END PACKAGE CODE #####
> >>>>>
> >>>>> The print(...) statement in disconnect(...) is there as a trace, as I
> >>>>> hoped that I'd see disconnect(...) being called when I quit (or
> >>>>> detach(..., unload = TRUE)).
> >>>>> But, it doesn't appear that disconnect(...) is ever called when the
> >>>>> package (and .CONNS) falls out of memory/scope (and I ran gc() after
> >>>>> detach(...), just to be sure).
> >>>>>
> >>>>> In a second 'shot-in-the-dark' attempt, I placed the reg.finalizer
> >>>>> call inside an .onLoad function, but that didn't seem to work, either.
> >>>>>
> >>>>> I'm guessing my use of reg.finalizer is way off-base here... but I
> >>>>> cannot infer from the reg.finalizer man page what I might be doing
> >>>>> wrong.
> >>>>> Is there a way to see, at the R-system level, what functions have been
> >>>>> registered as finalizers?
> >>>>>
> >>>>> Thanks for any pointers!
> >>>>>
> >>>>> -Murat
> >>>>>
> >>>>> ______________________________________________
> >>>>> R-devel at r-project.org mailing list
> >>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
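
Gábor's suggestion in the quoted thread above (represent each connection as its own environment, with its own finalizer) could be sketched roughly as follows. This is only an illustration under stated assumptions: new_connection(), finalize_connection(), and closed_ids are hypothetical names, a string id stands in for a real connection handle, and the finalizer is defined at top level so that it does not hold a reference to the connection it cleans up.

```r
## Sketch only: new_connection(), finalize_connection() and closed_ids
## are hypothetical; a string id stands in for a real handle.
closed_ids <- new.env(parent = emptyenv())
closed_ids$ids <- character(0)

## Closes one connection; safe to call more than once.
finalize_connection <- function(conn) {
    if (isTRUE(conn$open)) {
        conn$open <- FALSE
        closed_ids$ids <- c(closed_ids$ids, conn$id)
    }
    invisible(NULL)
}

## Each connection is a separate environment with its own finalizer.
new_connection <- function(id) {
    conn <- new.env(parent = emptyenv())
    conn$id <- id
    conn$open <- TRUE
    reg.finalizer(conn, finalize_connection, onexit = TRUE)
    conn
}

c1 <- new_connection("a")
c2 <- new_connection("b")
rm(c1)             # drop the only reference to the first connection
invisible(gc())    # its finalizer normally runs during this collection
```

With this layout, dropping one connection does not touch the others, and onexit = TRUE asks R to run any remaining finalizers when the session ends.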
