[R-pkg-devel] New test in R-devel causes existing packages to fail: "Error: connections left open"

David B. Dahl dahl using stat.byu.edu
Thu Aug 23 22:24:02 CEST 2018


Oops, I accidentally did not "reply-all".... Here is my message:

Thanks Uwe, Duncan, and Gabor for the responses, advice, and flexibility.

Regarding Uwe's suggestion:  "... there should be a function that
creates the connection and one that closes the connection," I should
clarify.  The rscala package does just that.  There is a function
(named "scala") that creates the connection (using delayedAssign) and
another that closes the connection (namely an S3 close method).  The
examples for the rscala package follow these full open/close semantics,
but...

The problem comes when authors of another package, let's call it the
"FooBar" package, want to implement an algorithm in Scala based on
functionality provided by the rscala package.  Let's say they write a
function called "neatAlgorithm" based on Scala.  Yes, the FooBar
package author could require that, before the user calls the
"neatAlgorithm" function, they first call a function to set up the
connection (which itself would call the "rscala::scala" function) and
then, after calling the "neatAlgorithm" function, they call a function
to close the connection.

But that is not very user friendly and exposes the user to
implementation details of the algorithm.  Users of the FooBar
package don't really care whether "neatAlgorithm" is implemented
in pure R, C++, Scala, or whatever, much like users of the 'lm'
function don't need to know the implementation details or do any setup
before and after calling the function.

The current approach is that the connection to Scala is transparent to
the end user of a package.  Behind the scenes, the package author
establishes the connection once it is needed, and the rscala package
manages the connection and explicitly closes it when 1. the package is
unloaded or 2. the R session ends.  This approach does not leave
dangling connections --- which I believe is the point of the new test
--- yet my package is caught up in the test.
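
To make that lifecycle concrete, here is a minimal sketch of the
pattern.  It is not the actual rscala code: the names ".bridge",
"bridge_connection", and "bridge_close" are hypothetical,
rawConnection() stands in for the real socket to the Scala process,
and registering a finalizer with onexit = TRUE is only one assumed way
to get cleanup when the session ends.

```r
# Hypothetical sketch; these names are illustrative, not rscala's API.
.bridge <- new.env(parent = emptyenv())

# Open the connection on first use and reuse it afterwards.
# rawConnection() is a stand-in for the real socket to the Scala process.
bridge_connection <- function() {
  if (is.null(.bridge$con)) {
    .bridge$con <- rawConnection(raw(0), open = "r+")
  }
  .bridge$con
}

# Close the cached connection, if any.  Safe to call repeatedly.
bridge_close <- function() {
  if (!is.null(.bridge$con)) {
    close(.bridge$con)
    .bridge$con <- NULL
  }
  invisible(NULL)
}

# In a package, cleanup would be wired to session exit and to unloading.
.onLoad <- function(libname, pkgname) {
  reg.finalizer(.bridge, function(e) bridge_close(), onexit = TRUE)
}
.onUnload <- function(libpath) {
  bridge_close()
}
```

With this arrangement the connection is closed deliberately, but it is
still open when an individual example finishes, which is what the new
check sees.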

I hope that this approach is still valid.  Perhaps the test could
result in a warning (instead of an error) and CRAN could accept
packages with such a warning.

If not, a work-around is to have a \dontshow section in the examples
that will close the connection (but leave the Scala process running)
and then automatically reestablish the connection as needed.  This
would not be very efficient but, as Duncan mentioned, it mostly
affects only the package examples themselves.  Plus, it would not be
too burdensome for package developers.
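
A rough sketch of how that work-around could look, under stated
assumptions: "ensure_connection", "disconnect", and "neat_algorithm"
are hypothetical names, rawConnection() again stands in for the socket,
and the comparison of getAllConnections() before and after only
imitates the spirit of the new check rather than reproducing it.

```r
# Hypothetical sketch; these names are illustrative, not rscala's API.
state <- new.env(parent = emptyenv())
state$opened <- 0L  # how many times the connection has been (re)established

# Reestablish the connection on demand.
ensure_connection <- function() {
  if (is.null(state$con)) {
    state$con <- rawConnection(raw(0), open = "r+")  # stand-in for a socket
    state$opened <- state$opened + 1L
  }
  state$con
}

# Drop only the connection; the external process would keep running.
disconnect <- function() {
  if (!is.null(state$con)) {
    close(state$con)
    state$con <- NULL
  }
  invisible(NULL)
}

# A user-facing function that needs the connection but never exposes it.
neat_algorithm <- function(x) {
  con <- ensure_connection()
  writeBin(as.raw(x), con)  # pretend to send work to the other side
  sum(x)
}

# What one example block would run:
before <- getAllConnections()
result <- neat_algorithm(1:3)
disconnect()  # in the Rd file this line would sit inside \dontshow{}
after <- getAllConnections()
```

The next call to neat_algorithm() simply reconnects, so the user never
sees the disconnect; the cost is one reconnection per example block.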

Again, thanks for considering my situation.

Best regards,

-- David

On Mon, Aug 20, 2018 at 11:11 PM Uwe Ligges
<ligges using statistik.tu-dortmund.de> wrote:
>
> My advice:
>
> Apparently you want to have communication via sockets to scala.
>
> So there should be a function that creates the connection and one that
> closes the connection.
> Comparable to starting some parallel cluster and stopping it again.
>
> In the meantime, you can allow for all sorts of communication.
>
> So that's fine.
>
> Then in your examples, simply design them to be standalone, i.e. in
> *your* examples always start the connection and stop it again at the end
> of one example block, i.e. the examples defined in one Rd file.
>
> Best,
> Uwe Ligges
>
>
>
> On 20.08.2018 02:11, Duncan Murdoch wrote:
> > On 19/08/2018 12:34 PM, Gábor Csárdi wrote:
> >> Sorry, missed that these were examples, so, yeah, that's harder.  G.
> >
> > How about a function that checks if the connection is open before doing
> > anything, and then at the end you close it if it wasn't already open?
> > This will make all examples run slower on CRAN, but won't affect most
> > users who are doing their own stuff as well as running examples.
> >
> > Or, how about the startup code for the package opens the connection?
> >
> > Or perhaps CRAN will respond to this thread with another suggestion.
> >
> > Duncan Murdoch
> >
> >
> >> On Sun, Aug 19, 2018 at 6:32 PM Gábor Csárdi <csardi.gabor using gmail.com>
> >> wrote:
> >>>
> >>> You could just create a function to close the connection and then
> >>> people could call it at the end of their test suites.
> >>> Gabor
> >>> On Sun, Aug 19, 2018 at 6:22 PM David B. Dahl <dahl using stat.byu.edu> wrote:
> >>>>
> >>>> In preparing to submit an update of my package to CRAN, I found that
> >>>> R-devel has a new test regarding "connections left open" that my packages
> >>>> fail.  The new test appears to have been committed by Uwe Ligges in
> >>>> revisions 74959 and 74964 on 2018-07-14 and 2018-07-15, respectively.
> >>>> The commit message says, "check after each example whether open
> >>>> connections exist, indicating e.g. file connections were left open or
> >>>> parallel clusters still running."
> >>>>
> >>>> I am hoping for advice on how to pass "R CMD check --as-cran".  Or,
> >>>> perhaps my situation will prompt a change to the test or, at least,
> >>>> having it result in a warning instead of an error.
> >>>>
> >>>> Below I describe the situation.  My rscala package allows developers
> >>>> to write R packages based on Scala (much like rJava and Rcpp for Java
> >>>> and C++, respectively).  Scala runs as a separate process and
> >>>> interprocess communication is implemented using socket connections.
> >>>>
> >>>> Suppose a package using rscala has functions that call Scala code.
> >>>> (Such packages are 'bamboo', 'sdols', and 'shallot' on CRAN.)  The
> >>>> first time a user executes an R function calling down into Scala, a
> >>>> socket connection between Scala and R is established.  For the sake of
> >>>> low latency, after the call to the function ends, the connection stays
> >>>> open until the package is unloaded or the R session ends.  But, this
> >>>> approach runs afoul of the new test mentioned above that appears to be
> >>>> designed to catch connections that are *accidentally* left open.
> >>>>
> >>>> I definitely do not want users of my packages 'bamboo', 'sdols',
> >>>> and 'shallot' to have to think about managing the connection between
> >>>> Scala and R.  That's an implementation detail, and using the package
> >>>> should be transparent for the user (who doesn't care about the
> >>>> implementation details).
> >>>>
> >>>> On my end, I see two solutions:  1. I could try to reengineer my
> >>>> approach --- establishing a new connection for every single call into
> >>>> Scala --- although I am loath to do anything to increase the latency,
> >>>> or 2. I could wrap all the examples in \donttest so that CRAN checks
> >>>> are passed.
> >>>>
> >>>> Or, again, perhaps my situation will prompt a reevaluation of the
> >>>> test.  Perhaps it could result in a warning (instead of an error) and
> >>>> the CRAN maintainers would accept packages with such a warning.
> >>>>
> >>>> Any advice?  Thanks a lot!
> >>>>
> >>>> -- David
> >>>>
> >>>> ______________________________________________
> >>>> R-package-devel using r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >>


