[R-pkg-devel] New test in R-devel causes existing packages to fail: "Error: connections left open"

David B. Dahl dahl using stat.byu.edu
Tue Aug 28 22:19:00 CEST 2018


Okay, Uwe, I will close the connection between examples.  Thanks Henrik,
Duncan, Gabor, and Uwe for participating in the discussion.

-- David
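
A minimal sketch of closing a connection at the end of an example, along the lines of Duncan's suggestion quoted below (close it only if the example caused it to be opened). Everything here is an illustrative assumption: the names state, bridge_open, bridge_close, and run_example are hypothetical, and a temporary file connection stands in for the socket to Scala.

```r
# Hypothetical stand-ins for a package's connection management.
state <- new.env()

bridge_open <- function() {
  # Open the connection only if it does not exist yet.
  if (is.null(state$con)) state$con <- file(tempfile(), open = "w")
  invisible(state$con)
}

bridge_close <- function() {
  if (!is.null(state$con)) {
    close(state$con)
    state$con <- NULL
  }
  invisible(NULL)
}

run_example <- function(code) {
  was_open <- !is.null(state$con)
  # Close only a connection that this example caused to be opened.
  on.exit(if (!was_open) bridge_close(), add = TRUE)
  code()
}

n_before <- length(getAllConnections())
run_example(function() writeLines("hello", bridge_open()))
n_after <- length(getAllConnections())
```

With this pattern, a user who already had the connection open keeps it, while an example run from a clean session leaves nothing behind.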

On Mon, Aug 27, 2018 at 4:47 PM Uwe Ligges <ligges using statistik.tu-dortmund.de>
wrote:
>
> I still do not understand why you cannot stop scala and related
> connections at the end of each example. You can insert a comment that
> this is not needed if you have follow up tasks for scala.
>
> Best,
> Uwe
>
> On 27.08.2018 07:14, David B. Dahl wrote:
> > Henrik,
> >
> > Thanks for the suggestion.  Yes, definitely, I think your more nuanced
> > test would be a big improvement.  The only wrinkle is that the
> > connection is established *not* when the package is *loaded* but
> > rather when the connection is *first needed* (using delayedAssign when
> > the package is loaded).  That way, loading the package doesn't block
> > the REPL for ~5 seconds while Scala and the JVM first start.
> >
> > -- David
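
The lazy setup described above can be illustrated with delayedAssign() on its own. This is a sketch under stated assumptions, not rscala's actual code: the environment name state is hypothetical, and a temporary file connection stands in for the Scala socket.

```r
# Hypothetical package-level environment holding the lazy connection.
state <- new.env()

# The expression is stored as a promise; nothing is opened yet.
delayedAssign("con", file(tempfile(), open = "w"), assign.env = state)

n_loaded <- length(getAllConnections())  # after "loading": no new connection

invisible(state$con)                     # first use forces the promise
n_used <- length(getAllConnections())    # now one more connection exists

close(state$con)                         # explicit cleanup
n_closed <- length(getAllConnections())
```

The connection only shows up once the value is first accessed, which is why a check that runs after an example sees it even though loading the package opened nothing.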
> >
> > On Thu, Aug 23, 2018 at 11:19 PM Henrik Bengtsson
> > <henrik.bengtsson using gmail.com> wrote:
> >>
> >> Does R CMD check --as-cran test for newly opened connections or any
> >> open connections?  Could the check for stray connections in
> >> examples/vignettes be:
> >>
> >> 1. Record what connections are open
> >> 2. Attach the package
> >> 3. Record what connections are open
> >> 4. Run the example
> >> 5. Assert that no *new* connections in addition to what's recorded in
> >> Step 3 are open
> >> 6. Unload the package
> >> 7. Assert that no *new* connections in addition to what's recorded in
> >> Step 1 are open
> >>
> >> Step 5 asserts that the code in the example does not leave stray
> >> connections behind, and Step 7 that the package itself does not leave
> >> stray connections behind.
> >>
> >> /Henrik
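
The seven steps above amount to a before/after comparison of the set of open connections. A minimal sketch in R, assuming getAllConnections() as the way to list them and a temporary file connection as a stand-in for an example that leaks. The helper name new_connections is hypothetical, and this is not a description of how R CMD check is implemented.

```r
# Hypothetical helper (not part of R CMD check): connections that are
# open now but were not in the recorded baseline.
new_connections <- function(baseline) {
  setdiff(getAllConnections(), baseline)
}

step1 <- getAllConnections()         # Step 1: before attaching the package
# Step 2 would be library(<package>); omitted in this sketch.
step3 <- getAllConnections()         # Step 3: after attaching

con <- file(tempfile(), open = "w")  # Step 4: an "example" that opens a file
stray <- new_connections(step3)      # Step 5: non-empty, so the check fails
close(con)                           # what the example should have done

# Steps 6 and 7: after unloading, compare against the Step 1 baseline.
leftover <- new_connections(step1)
```

Comparing against two baselines is what separates connections left behind by an example from connections owned by the package itself.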
> >> On Thu, Aug 23, 2018 at 1:25 PM David B. Dahl <dahl using stat.byu.edu>
> >> wrote:
> >>>
> >>> Oops, I accidentally did not "reply-all".... Here is my message:
> >>>
> >>> Thanks Uwe, Duncan, and Gabor for the response, advice, and
> >>> flexibility.
> >>>
> >>> Regarding Uwe's suggestion:  "... there should be a function that
> >>> creates the connection and one that closes the connection," I should
> >>> clarify.  The rscala package does just that.  There is a function
> >>> (named "scala") that creates the connection (using delayedAssign) and
> >>> another that closes the connection (namely an S3 close method).  The
> >>> examples for the rscala package follow these full open/close
> >>> semantics, but...
> >>>
> >>> The problem comes when authors of another package, let's call it the
> >>> "FooBar" package, want to implement an algorithm in Scala based on
> >>> functionality provided by the rscala package.  Let's say they write a
> >>> function called "neatAlgorithm" based on Scala.  Yes, the FooBar
> >>> package author could require that, before the user calls the
> >>> "neatAlgorithm" function, they first call a function to set up the
> >>> connection (which itself would call the "rscala::scala" function) and
> >>> then, after calling the "neatAlgorithm" function, they call a function
> >>> to close the connection.
> >>>
> >>> But that is not very user friendly and exposes the user to
> >>> implementation details of the algorithm.  Users of the FooBar
> >>> package don't really care whether the "neatAlgorithm" is implemented
> >>> in pure R, C++, Scala, or whatever, much like the users of the 'lm'
> >>> function don't need to know the implementation details or do any setup
> >>> before and after calling the function.
> >>>
> >>> The current approach is that the connection to Scala is transparent to
> >>> the end user of a package.  Behind the scenes, the package author
> >>> establishes the connection once it is needed and the rscala package
> >>> manages the connection and explicitly closes it when 1. the package is
> >>> unloaded or 2. the R session ends.  This approach does not leave
> >>> dangling connections  --- which I believe is the point of the new test
> >>> --- yet my package is caught up in the test.
> >>>
> >>> I hope that this approach is still valid.  Perhaps the test could
> >>> result in a warning (instead of an error) and CRAN could accept
> >>> packages with such a warning.
> >>>
> >>> If not, a work-around is to have a \dontshow section in the examples
> >>> that will close the connection (but leave the Scala process running)
> >>> and then automatically reestablish the connection as needed.  This
> >>> would not be very efficient but, as Duncan mentioned, it mostly only
> >>> affects the package examples themselves.  Plus, it would not be too
> >>> burdensome for package developers.
> >>>
> >>> Again, thanks for considering my situation.
> >>>
> >>> Best regards,
> >>>
> >>> -- David
> >>>
> >>> On Mon, Aug 20, 2018 at 11:11 PM Uwe Ligges
> >>> <ligges using statistik.tu-dortmund.de> wrote:
> >>>>
> >>>> My advice:
> >>>>
> >>>> Apparently you want to have communication via sockets to scala.
> >>>>
> >>>> So there should be a function that creates the connection and one that
> >>>> closes the connection.
> >>>> Comparable to starting some parallel cluster and stopping it again.
> >>>>
> >>>> In the meantime, you can allow for all sorts of communication.
> >>>>
> >>>> So that's fine.
> >>>>
> >>>> Then in your examples, simply design them to be standalone, i.e. in
> >>>> *your* examples always start the connection and stop it again at the
> >>>> end of one examples block, i.e. the examples defined in one Rd file.
> >>>>
> >>>> Best,
> >>>> Uwe Ligges
> >>>>
> >>>>
> >>>>
> >>>> On 20.08.2018 02:11, Duncan Murdoch wrote:
> >>>>> On 19/08/2018 12:34 PM, Gábor Csárdi wrote:
> >>>>>> Sorry, missed that these were examples, so, yeah, that's harder.
> >>>>>> G.
> >>>>>
> >>>>> How about a function that checks if the connection is open before
> >>>>> doing anything, and then at the end you close it if it wasn't
> >>>>> already open?  This will make all examples run slower on CRAN, but
> >>>>> won't affect most users who are doing their own stuff as well as
> >>>>> running examples.
> >>>>>
> >>>>> Or, how about the startup code for the package opens the connection?
> >>>>>
> >>>>> Or perhaps CRAN will respond to this thread with another suggestion.
> >>>>>
> >>>>> Duncan Murdoch
> >>>>>
> >>>>>
> >>>>>> On Sun, Aug 19, 2018 at 6:32 PM Gábor Csárdi
> >>>>>> <csardi.gabor using gmail.com> wrote:
> >>>>>>>
> >>>>>>> You could just create a function to close the connection and then
> >>>>>>> people could call it at the end of their test suites.
> >>>>>>> Gabor
> >>>>>>> On Sun, Aug 19, 2018 at 6:22 PM David B. Dahl <dahl using stat.byu.edu>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> In preparing to submit an update of my package to CRAN, I found
> >>>>>>>> that R-devel has a new test regarding "connections left open"
> >>>>>>>> that my packages fail.  The new test appears to have been
> >>>>>>>> committed by Uwe Ligges in revisions 74959 and 74964 on
> >>>>>>>> 2018-07-14 and 2018-07-15, respectively.  The commit message
> >>>>>>>> says, "check after each example whether open connections exist,
> >>>>>>>> indicating e.g. file connections were left open or parallel
> >>>>>>>> clusters still running."
> >>>>>>>>
> >>>>>>>> I am hoping for advice on how to pass "R CMD check --as-cran".
> >>>>>>>> Or, perhaps my situation will prompt a change to the test or, at
> >>>>>>>> least, having it result in a warning instead of an error.
> >>>>>>>>
> >>>>>>>> Below I describe the situation.  My rscala package allows
> >>>>>>>> developers to write R packages based on Scala (much like rJava
> >>>>>>>> and Rcpp for Java and C++, respectively).  Scala runs as a
> >>>>>>>> separate process and interprocess communication is implemented
> >>>>>>>> using socket connections.
> >>>>>>>>
> >>>>>>>> Suppose a package using rscala has functions that call Scala
> >>>>>>>> code.  (Such packages are 'bamboo', 'sdols', and 'shallot' on
> >>>>>>>> CRAN.)  The first time a user executes an R function calling
> >>>>>>>> down into Scala, a socket connection between Scala and R is
> >>>>>>>> established.  For the sake of low latency, after the call to the
> >>>>>>>> function ends, the connection stays open until the package is
> >>>>>>>> unloaded or the R session ends.  But, this approach runs afoul
> >>>>>>>> of the new test mentioned above that appears to be designed to
> >>>>>>>> catch connections that are *accidentally* left open.
> >>>>>>>>
> >>>>>>>> I definitely do not want users of my packages 'bamboo', 'sdols',
> >>>>>>>> and 'shallot' to have to think about managing the connection
> >>>>>>>> between Scala and R.  That's an implementation detail and using
> >>>>>>>> the package should be transparent for the user (who doesn't care
> >>>>>>>> about the implementation details).
> >>>>>>>>
> >>>>>>>> On my end, I see two solutions:  1. I could try to reengineer my
> >>>>>>>> approach --- establishing a new connection for every single call
> >>>>>>>> into Scala --- although I am loath to do anything to increase
> >>>>>>>> the latency, or 2. I could wrap all the examples in \donttest so
> >>>>>>>> that CRAN checks are passed.
> >>>>>>>>
> >>>>>>>> Or, again, perhaps my situation will prompt a reevaluation of the
> >>>>>>>> test.  Perhaps it could result in a warning (instead of an
> >>>>>>>> error) and the CRAN maintainers would accept packages with such
> >>>>>>>> a warning.
> >>>>>>>>
> >>>>>>>> Any advice?  Thanks a lot!
> >>>>>>>>
> >>>>>>>> -- David
> >>>>>>>>
> >>>>>>>> ______________________________________________
> >>>>>>>> R-package-devel using r-project.org mailing list
> >>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
