[R-pkg-devel] New test in R-devel causes existing packages to fail: "Error: connections left open"

Henrik Bengtsson henrik.bengtsson using gmail.com
Thu Aug 23 23:19:42 CEST 2018


Does R CMD check --as-cran test for newly opened connections or for any
open connections?  Could the check for stray connections in
examples/vignettes be:

1. Record what connections are open
2. Attach the package
3. Record what connections are open
4. Run the example
5. Assert that no *new* connections in addition to what's recorded in
Step 3 are open
6. Unload the package
7. Assert that no *new* connections in addition to what's recorded in
Step 1 are open

Step 5 asserts that the code in the example does not leave stray
connections behind, and Step 7 that the package itself does not leave
stray connections behind.
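
A minimal sketch of the steps above in R, not how R CMD check is actually
implemented.  The helper names (snapshot, new_since, check_stray) are
hypothetical, and the sketch assumes the package is not already loaded and
that a quoted expression stands in for the example code:

```r
# Snapshot of the ids of all connections R currently knows about.
# getAllConnections() also lists stdin/stdout/stderr (ids 0, 1, 2).
snapshot <- function() getAllConnections()

# Ids that exist now but were absent from an earlier snapshot.
new_since <- function(before) setdiff(getAllConnections(), before)

# Hypothetical checker: 'pkg' is a package name, 'example_code' a quoted
# expression standing in for the example being run.
check_stray <- function(pkg, example_code) {
  s1 <- snapshot()                             # Step 1
  library(pkg, character.only = TRUE)          # Step 2
  s3 <- snapshot()                             # Step 3
  eval(example_code, envir = globalenv())      # Step 4
  left_by_example <- new_since(s3)             # Step 5
  unloadNamespace(pkg)                         # Step 6 (detaches first if attached)
  left_after_unload <- new_since(s1)           # Step 7
  list(example = left_by_example, unload = left_after_unload)
}
```

A real check would turn non-empty results into errors or warnings.  Comparing
ids is only approximate, since an id can be reused once a connection has been
closed.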

/Henrik
On Thu, Aug 23, 2018 at 1:25 PM David B. Dahl <dahl using stat.byu.edu> wrote:
>
> Oops, I accidentally did not "reply-all".... Here is my message:
>
> Thanks Uwe, Duncan, and Gabor for the responses, advice, and flexibility.
>
> Regarding Uwe's suggestion:  "... there should be a function that
> creates the connection and one that closes the connection," I should
> clarify.  The rscala package does just that.  There is a function
> (named "scala") that creates the connection (using delayedAssign) and
> another that closes the connection (namely an S3 close method).  The
> examples for the rscala package follow these full open/close semantics,
> but...
>
> The problem comes when authors of another package, let's call it the
> "FooBar" package, want to implement an algorithm in Scala based on
> functionality provided by the rscala package.  Let's say they write a
> function called "neatAlgorithm" based on Scala.  Yes, the FooBar
> package author could require that, before the user calls the
> "neatAlgorithm" function, they first call a function to set up the
> connection (which itself would call the "rscala::scala" function) and
> then, after calling the "neatAlgorithm" function, they call a function
> to close the connection.
>
> But that is not very user friendly and exposes the user to
> implementation details of the algorithm.  Users of the FooBar
> package don't really care whether "neatAlgorithm" is implemented
> in pure R, C++, Scala, or whatever, much like the users of the 'lm'
> function don't need to know the implementation details or do any setup
> before and after calling the function.
>
> The current approach is that the connection to Scala is transparent to
> the end user of a package.  Behind the scenes, the package author
> establishes the connection once it is needed and the rscala package
> manages the connection and explicitly closes it when 1. the package is
> unloaded or 2. the R session ends.  This approach does not leave
> dangling connections  --- which I believe is the point of the new test
> --- yet my package is caught up in the test.
>
> I hope that this approach is still valid.  Perhaps the test could
> result in a warning (instead of an error) and CRAN could accept
> packages with such a warning.
>
> If not, a work-around is to have a \dontshow section in the examples
> that will close the connection (but leave the Scala process running)
> and then automatically reestablish the connection as needed.  This
> would not be very efficient but, as Duncan mentioned, it mostly
> affects only the package examples themselves.  Plus, it would not be too
> burdensome for package developers.
>
> Again, thanks for considering my situation.
>
> Best regards,
>
> -- David
>
> On Mon, Aug 20, 2018 at 11:11 PM Uwe Ligges
> <ligges using statistik.tu-dortmund.de> wrote:
> >
> > My advice:
> >
> > Apparently you want to have communication via sockets to scala.
> >
> > So there should be a function that creates the connection and one that
> > closes the connection.
> > Comparable to starting some parallel cluster and stopping it again.
> >
> > In the meantime, you can allow for all sorts of communication.
> >
> > So that's fine.
> >
> > Then in your examples, simply design them to be standalone, i.e. in
> > *your* examples always start the connection and stop it again at the end
> > of one examples block, i.e. the examples defined in one Rd file.
> >
> > Best,
> > Uwe Ligges
> >
> >
> >
> > On 20.08.2018 02:11, Duncan Murdoch wrote:
> > > On 19/08/2018 12:34 PM, Gábor Csárdi wrote:
> > >> Sorry, missed that these were examples, so, yeah, that's harder.  G.
> > >
> > > How about a function that checks if the connection is open before doing
> > > anything, and then at the end you close it if it wasn't already open?
> > > This will make all examples run slower on CRAN, but won't affect most
> > > users who are doing their own stuff as well as running examples.
> > >
> > > Or, how about the startup code for the package opens the connection?
> > >
> > > Or perhaps CRAN will respond to this thread with another suggestion.
> > >
> > > Duncan Murdoch
> > >
> > >
> > >> On Sun, Aug 19, 2018 at 6:32 PM Gábor Csárdi <csardi.gabor using gmail.com>
> > >> wrote:
> > >>>
> > >>> You could just create a function to close the connection and then
> > >>> people could call it at the end of their test suites.
> > >>> Gabor
> > >>> On Sun, Aug 19, 2018 at 6:22 PM David B. Dahl <dahl using stat.byu.edu> wrote:
> > >>>>
> > >>>> In preparing to submit an update of my package to CRAN, I found that
> > >>>> R-devel has a new test regarding "connections left open" that my packages
> > >>>> fail.  The new test appears to have been committed by Uwe Ligges in
> > >>>> revisions 74959 and 74964 on 2018-07-14 and 2018-07-15, respectively.
> > >>>> The commit message says, "check after each example whether open
> > >>>> connections exist, indicating e.g. file connections were left open or
> > >>>> parallel clusters still running."
> > >>>>
> > >>>> I am hoping for advice on how to pass "R CMD check --as-cran".  Or,
> > >>>> perhaps my situation will prompt a change to the test or, at least,
> > >>>> having it result in a warning instead of an error.
> > >>>>
> > >>>> Below I describe the situation.  My rscala package allows developers
> > >>>> to write R packages based on Scala (much like rJava and Rcpp for Java
> > >>>> and C++, respectively).  Scala runs as a separate process and
> > >>>> interprocess communication is implemented using socket connections.
> > >>>>
> > >>>> Suppose a package using rscala has functions that call Scala code.
> > >>>> (Such packages are 'bamboo', 'sdols', and 'shallot' on CRAN.)  The
> > >>>> first time a user executes an R function calling down into Scala, a
> > >>>> socket connection between Scala and R is established.  For the sake of
> > >>>> low latency, after the call to the function ends, the connection stays
> > >>>> open until the package is unloaded or the R session ends.  But, this
> > >>>> approach runs afoul of the new test mentioned above that appears to be
> > >>>> designed to catch connections that are *accidentally* left open.
> > >>>>
> > >>>> I definitely do not want users of my packages 'bamboo', 'sdols',
> > >>>> and 'shallot' to have to think about managing the connection between Scala
> > >>>> and R.  That's an implementation detail and using the package should be
> > >>>> transparent for the user (who doesn't care about the implementation
> > >>>> details).
> > >>>>
> > >>>> On my end, I see two solutions:  1. I could try to reengineer my
> > >>>> approach --- establishing a new connection for every single call into
> > >>>> Scala --- although I am loath to do anything to increase the latency,
> > >>>> or 2. I could wrap all the examples in \donttest so that CRAN checks
> > >>>> are passed.
> > >>>>
> > >>>> Or, again, perhaps my situation will prompt a reevaluation of the
> > >>>> test.  Perhaps it could result in a warning (instead of an error) and
> > >>>> the CRAN maintainers would accept packages with such a warning.
> > >>>>
> > >>>> Any advice?  Thanks a lot!
> > >>>>
> > >>>> -- David
> > >>>>
> > >>>> ______________________________________________
> > >>>> R-package-devel using r-project.org mailing list
> > >>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel


