[R-pkg-devel] New test in R-devel causes existing packages to fail: "Error: connections left open"

David B. Dahl dahl using stat.byu.edu
Mon Aug 27 07:14:26 CEST 2018


Henrik,

Thanks for the suggestion.  Yes, definitely, I think your more nuanced
test would be a big improvement.  The only wrinkle is that the
connection is established *not* when the package is *loaded* but
rather when the connection is *first needed* (using delayedAssign when
the package is loaded).  That way, loading the package doesn't block
the REPL for ~5 seconds while Scala and the JVM first start.
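
A minimal sketch of that lazy pattern, not taken from rscala itself:
open_bridge() and pkg_env are hypothetical names, and a text connection
stands in for the socket to Scala.

```r
# Hypothetical stand-in for the slow start-up step (Scala and the JVM in
# the rscala case).  A text connection plays the role of the socket.
open_bridge <- function() {
  message("opening connection (slow step)")
  textConnection("stand-in for a socket")
}

pkg_env <- new.env()

# Register a promise only: open_bridge() is NOT called at this point.
delayedAssign("bridge", open_bridge(), assign.env = pkg_env)

n_before <- nrow(showConnections())

# The first access forces the promise, which opens the connection.
con <- get("bridge", envir = pkg_env)
n_after <- nrow(showConnections())

close(con)
```

The count of open connections changes only at the first access, which is
why a check made right after loading the package would see nothing new.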

-- David

On Thu, Aug 23, 2018 at 11:19 PM Henrik Bengtsson
<henrik.bengtsson using gmail.com> wrote:
>
> Does R CMD check --as-cran test for newly opened connections or any
> open connections?  Could the check for stray connection in
> examples/vignettes be:
>
> 1. Record what connections are open
> 2. Attach the package
> 3. Record what connections are open
> 4. Run the example
> 5. Assert that no *new* connections in addition to what's recorded in
> Step 3 are open
> 6. Unload the package
> 7. Assert that no *new* connections in addition to what's recorded in
> Step 1 are open
>
> Step 5 asserts that the code in the example does not leave stray
> connections behind, and Step 7 that the package itself does not leave
> stray connections behind.
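
The before/after comparison in Steps 3 to 5 could be sketched roughly as
follows.  This is not the code used by R CMD check; new_connections() is
a hypothetical helper, and text connections stand in for whatever an
example might open.  The same comparison could be repeated around
attaching and unloading the package for Steps 1 and 7.

```r
# Hypothetical helper: evaluate `code` and return the ids of connections
# that exist afterwards but did not exist before.
new_connections <- function(code) {
  before <- getAllConnections()
  force(code)                      # runs the example code
  setdiff(getAllConnections(), before)
}

# An example that cleans up after itself: nothing new is reported.
clean <- new_connections({
  con <- textConnection("a")
  close(con)
})

# An example that leaves a connection open: it is reported.
leaky_con <- NULL
stray <- new_connections({
  leaky_con <- textConnection("b")
})

close(leaky_con)                   # tidy up after the demonstration
```

Because only the difference is inspected, a connection that was already
open before the example ran is not flagged.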
>
> /Henrik
> On Thu, Aug 23, 2018 at 1:25 PM David B. Dahl <dahl using stat.byu.edu> wrote:
> >
> > Oops, I accidentally did not "reply-all".... Here is my message:
> >
> > Thanks Uwe, Duncan, and Gabor for the responses, advice, and flexibility.
> >
> > Regarding Uwe's suggestion:  "... there should be a function that
> > creates the connection and one that closes the connection," I should
> > clarify.  The rscala package does just that.  There is a function
> > (named "scala") that creates the connection (using delayedAssign) and
> > another that closes the connection (namely an S3 close method).  The
> > examples for the rscala package follow these full open/close semantics,
> > but...
> >
> > The problem comes when authors of another package, let's call it the
> > "FooBar" package, want to implement an algorithm in Scala based on
> > functionality provided by the rscala package.  Let's say they write a
> > function called "neatAlgorithm" based on Scala.  Yes, the FooBar
> > package author could require that, before the user calls the
> > "neatAlgorithm" function, they first call a function to set up the
> > connection (which itself would call the "rscala::scala" function) and
> > then, after calling the "neatAlgorithm" function, they call a function
> > to close the connection.
> >
> > But that is not very user friendly and exposes the user to
> > implementation details of the algorithm.  Users of the FooBar
> > package don't really care whether "neatAlgorithm" is implemented
> > in pure R, C++, Scala, or whatever, much like the users of the 'lm'
> > function don't need to know the implementation details or do any setup
> > before and after calling the function.
> >
> > The current approach is that the connection to Scala is transparent to
> > the end user of a package.  Behind the scenes, the package author
> > establishes the connection once it is needed and the rscala package
> > manages the connection and explicitly closes it when 1. the package is
> > unloaded or 2. the R session ends.  This approach does not leave
> > dangling connections  --- which I believe is the point of the new test
> > --- yet my package is caught up in the test.
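
One way such cleanup might be wired up in a package is sketched below.
This is an assumption-laden illustration rather than rscala's actual
code: state and close_bridge() are hypothetical names, and a text
connection stands in for the Scala socket.

```r
state <- new.env()   # package-private state (hypothetical name)

# Close the managed connection if there is one.  `e` is the environment
# that holds it, so the function can also serve as a finalizer.
close_bridge <- function(e = state) {
  if (!is.null(e$con)) {
    close(e$con)
    e$con <- NULL
  }
  invisible(NULL)
}

# In a package these hooks would typically live in R/zzz.R.
.onLoad <- function(libname, pkgname) {
  # onexit = TRUE asks R to run the finalizer when the session ends.
  reg.finalizer(state, close_bridge, onexit = TRUE)
}
.onUnload <- function(libpath) close_bridge()

# Demonstration: simulate a connection opened on first use, then unload.
n0 <- nrow(showConnections())
state$con <- textConnection("stand-in for the Scala socket")
n_open <- nrow(showConnections())
.onUnload(NULL)
n_closed <- nrow(showConnections())
```

With this arrangement the connection stays open between calls but is
closed in both of the cases listed above.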
> >
> > I hope that this approach is still valid.  Perhaps the test could
> > result in a warning (instead of an error) and CRAN could accept
> > packages with such a warning.
> >
> > If not, a work-around is to have a \dontshow section in the examples
> > that will close the connection (but leave the Scala process running)
> > and then automatically reestablish the connection as needed.  This
> > would not be very efficient but, as Duncan mentioned, it mostly
> > affects only the package examples themselves.  Plus, it would not be too
> > burdensome for package developers.
> >
> > Again, thanks for considering my situation.
> >
> > Best regards,
> >
> > -- David
> >
> > On Mon, Aug 20, 2018 at 11:11 PM Uwe Ligges
> > <ligges using statistik.tu-dortmund.de> wrote:
> > >
> > > My advice:
> > >
> > > Apparently you want to have communication via sockets to scala.
> > >
> > > So there should be a function that creates the connection and one that
> > > closes the connection.
> > > Comparable to starting some parallel cluster and stopping it again.
> > >
> > > In the meantime, you can allow for all sorts of communication.
> > >
> > > So that's fine.
> > >
> > > Then in your examples, simply design them to be standalone, i.e. in
> > > *your* examples always start the connection and stop it again at the end
> > > of one example block, i.e. the examples defined in one Rd file.
> > >
> > > Best,
> > > Uwe Ligges
> > >
> > >
> > >
> > > On 20.08.2018 02:11, Duncan Murdoch wrote:
> > > > On 19/08/2018 12:34 PM, Gábor Csárdi wrote:
> > > >> Sorry, missed that these were examples, so, yeah, that's harder.  G.
> > > >
> > > > How about a function that checks if the connection is open before doing
> > > > anything, and then at the end you close it if it wasn't already open?
> > > > This will make all examples run slower on CRAN, but won't affect most
> > > > users who are doing their own stuff as well as running examples.
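
A rough sketch of that first idea: remember whether the connection was
already open, and close it afterwards only if this call opened it.  All
names here (cache, with_bridge(), and so on) are hypothetical, and a
text connection stands in for the socket.

```r
cache <- new.env()   # holds the managed connection, if any

bridge_is_open <- function() !is.null(cache$con)

open_bridge <- function() {
  if (!bridge_is_open()) cache$con <- textConnection("stand-in")
  invisible(NULL)
}

close_bridge <- function() {
  if (bridge_is_open()) {
    close(cache$con)
    cache$con <- NULL
  }
  invisible(NULL)
}

# Run `code` with the connection available; restore the previous state.
with_bridge <- function(code) {
  was_open <- bridge_is_open()
  open_bridge()
  if (!was_open) on.exit(close_bridge(), add = TRUE)
  force(code)
}

n0 <- nrow(showConnections())
r1 <- with_bridge(1 + 1)           # opened and closed by this call
n1 <- nrow(showConnections())

open_bridge()                      # the user opened it beforehand
r2 <- with_bridge(2 + 2)           # so it is left open afterwards
n2 <- nrow(showConnections())
close_bridge()
```

An example that runs in a fresh session pays the start-up cost each
time, while a user who already has the connection open is unaffected.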
> > > >
> > > > Or, how about the startup code for the package opens the connection?
> > > >
> > > > Or perhaps CRAN will respond to this thread with another suggestion.
> > > >
> > > > Duncan Murdoch
> > > >
> > > >
> > > >> On Sun, Aug 19, 2018 at 6:32 PM Gábor Csárdi <csardi.gabor using gmail.com>
> > > >> wrote:
> > > >>>
> > > >>> You could just create a function to close the connection and then
> > > >>> people could call it at the end of their test suites.
> > > >>> Gabor
> > > >>> On Sun, Aug 19, 2018 at 6:22 PM David B. Dahl <dahl using stat.byu.edu> wrote:
> > > >>>>
> > > >>>> In preparing to submit an update of my package to CRAN, I found that
> > > >>>> R-devel has a new test regarding "connections left open" that my packages
> > > >>>> fail.  The new test appears to have been committed by Uwe Ligges in
> > > >>>> revisions 74959 and 74964 on 2018-07-14 and 2018-07-15, respectively.
> > > >>>> The commit message says, "check after each example whether open
> > > >>>> connections exist, indicating e.g. file connections were left open or
> > > >>>> parallel clusters still running."
> > > >>>>
> > > >>>> I am hoping for advice on how to pass "R CMD check --as-cran".  Or,
> > > >>>> perhaps my situation will prompt a change to the test or, at least,
> > > >>>> having it result in a warning instead of an error.
> > > >>>>
> > > >>>> Below I describe the situation.  My rscala package allows developers
> > > >>>> to write R packages based on Scala (much like rJava and Rcpp for Java
> > > >>>> and C++, respectively).  Scala runs as a separate process and
> > > >>>> interprocess communication is implemented using socket connections.
> > > >>>>
> > > >>>> Suppose a package using rscala has functions that call Scala code.
> > > >>>> (Such packages are 'bamboo', 'sdols', and 'shallot' on CRAN.)  The
> > > >>>> first time a user executes an R function calling down into Scala, a
> > > >>>> socket connection between Scala and R is established.  For the sake of
> > > >>>> low latency, after the call to the function ends, the connection stays
> > > >>>> open until the package is unloaded or the R session ends.  But, this
> > > >>>> approach runs afoul of the new test mentioned above that appears to be
> > > >>>> designed to catch connections that are *accidentally* left open.
> > > >>>>
> > > >>>> I definitely do not want users of my packages 'bamboo', 'sdols',
> > > >>>> and 'shallot' to have to think about managing the connection between
> > > >>>> Scala and R.  That's an implementation detail, and using the package should be
> > > >>>> transparent for the user (who doesn't care about the implementation
> > > >>>> details).
> > > >>>>
> > > >>>> On my end, I see two solutions:  1. I could try to reengineer my
> > > >>>> approach --- establishing a new connection for every single call into
> > > >>>> Scala --- although I am loath to do anything to increase the latency,
> > > >>>> or 2. I could wrap all the examples in \donttest so that CRAN checks
> > > >>>> are passed.
> > > >>>>
> > > >>>> Or, again, perhaps my situation will prompt a reevaluation of the
> > > >>>> test.  Perhaps it could result in a warning (instead of an error) and
> > > >>>> the CRAN maintainers would accept packages with such a warning.
> > > >>>>
> > > >>>> Any advice?  Thanks a lot!
> > > >>>>
> > > >>>> -- David
> > > >>>>
> > > >>>> ______________________________________________
> > > >>>> R-package-devel using r-project.org mailing list
> > > >>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel


