[R-pkg-devel] Use of `:::` in a package for code run in a parallel cluster

Tue Sep 15 10:27:57 CEST 2020

Henrik,

I completely agree with everything you wrote, but note that the issue at hand is using `:::` in *the same* package, for example when a package needs to access its own internal functions from an outside context; running code on a cluster node set up by the package is one case I can think of. So there is no API contract to violate, except the one the package makes with itself. Given this, I'm inclined to agree with David: the language provides an obvious way to do this, so why write a semantic kludge that is obviously intended only to circumvent the CRAN check note in order to achieve something the package needs? Of course, just my €.02 in a thought-provoking discussion!
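
To make the scenario concrete, here is a minimal sketch of the two options being discussed; it is not David's actual code. It assumes a local two-node cluster in place of the user-supplied one, and it uses `stats` merely as a stand-in for "the package itself" and `weighted.mean.default()` as a stand-in for one of its internal functions, chosen only so the example runs without building a custom package. The object name `helper` is likewise hypothetical.

```r
library(parallel)

cl <- makeCluster(2)  # stand-in for the user-supplied cluster

# The user already keeps an object called 'helper' on every node.
invisible(clusterEvalQ(cl, helper <- "user data"))

# Option 1: clusterExport() copies the function into each node's global
# environment and silently replaces the user's object of the same name.
helper <- function(x, w) sum(x * w) / sum(w)
clusterExport(cl, "helper", envir = environment())
after_export <- clusterEvalQ(cl, is.function(helper))

# Put the user's object back before trying the second option.
invisible(clusterEvalQ(cl, helper <- "user data"))

# Option 2: the worker function resolves the helper through ':::' in the
# namespace of the package installed on the node, so nothing is written
# to the node's global environment.
# ASSUMPTION: 'stats' plays the role of the package's own namespace here.
worker <- function(x) stats:::weighted.mean.default(x, w = rev(x))
res <- parLapply(cl, list(1:3, 4:6), worker)
untouched <- clusterEvalQ(cl, identical(helper, "user data"))

stopCluster(cl)
```

In this sketch `after_export` records that the user's `helper` was replaced by a function on every node, while `untouched` records that the `:::` route left it alone. Within a real package the same lookup would name the package itself, which is exactly the construct the check note flags.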

Cesko

On 14-09-2020 at 21:42, Henrik Bengtsson wrote:
> I have not read all of the comments already made here, but my
> understanding is that ::: is not allowed because you are reaching into
> the internal API, which the package owner does not guarantee will exist
> in the next release.  If you rely on the internal code of another CRAN
> package in your CRAN package, your CRAN package might break through no
> fault of your own.  This might set off an avalanche of reverse package
> dependencies failing on CRAN.
> 
> The only thing you can safely rely on is the API that is explicitly
> *exported* by an R package.  In order for the maintainer to break that
> API for reverse dependent packages, they need to go through a process
> of deprecating and defuncting what they want to break/remove - a
> process that involves multiple releases and often reaching out to
> package maintainers and asking them to update accordingly.   CRAN runs
> reverse package dependency checks making sure that a package does not
> break its exported API.  If it does, it will not roll out on CRAN.
> So, in that sense CRAN helps uphold the contract of the exported APIs.
> In contrast, a maintainer can do whatever they want whenever they want
> with their internal code/API.
> 
> With more and more packages being infrastructure packages, I think
> there is room for a "protected" API, which is not exported, to avoid
> cluttering up the search path for end-users, while still providing a
> contract toward package developers relying on it.  There are various
> ways to emulate such protected APIs but we don't have a standard and
> there's a risk that 'R CMD check' fails to detect when the contract is
> broken (resulting in delayed run-time errors on the user end).
> 
> My $.02
> 
> Henrik
> 
> On Mon, Sep 14, 2020 at 12:06 PM David Kepplinger
> <david.kepplinger using gmail.com> wrote:
>>
>> Yes, my view is certainly rigid and I agree that in the cases where the
>> function is actually used directly by the user, exporting it is the correct
>> step.
>>
>> However, it seems some packages actually need to access internal functions
>> from an outside context, but the code that accesses the function is
>> logically contained completely inside the package. In these cases, package
>> maintainers seem to be looking for alternatives to `:::` for the sake of
>> avoiding the R CMD check note. I argue that the workarounds, however,
>> either (a) achieve the exact same result as `:::`, but in a less
>> transparent and likely more error-prone way, or (b) unnecessarily make an
>> internal function available to the user.
>>
>> I also agree with the CRAN team that package maintainers need to be made
>> aware of the issue when using `:::` inside their package as it is most
>> likely unnecessary. But the phrasing of the note ("almost never needs to
>> use :::") combined with a lack of transparent guidelines on when it is
>> acceptable leads to maintainers looking for alternatives mimicking the
>> behavior of `:::`. I haven't found any official instructions in Writing R
>> extensions or on the mailing list under what circumstances `:::` is deemed
>> to be acceptable by the CRAN team (I have to admit searching for `:::` in
>> the archives yields so many results I haven't looked at all of them). It's
>> probably impossible to conceive every possible use case for `:::`, but a
>> good start may be to have something in the documentation explicitly
>> mentioning commonly observed patterns where `:::` is not acceptable, and
>> the common exceptions to the rule (if there are any).
>>
>> Maybe this issue is so minuscule and comes up so rarely that it's not
>> worth mentioning in the documentation.
>>
>> Best,
>> David
>>
>>
>>
>> On Mon, Sep 14, 2020 at 3:19 AM Georgi Boshnakov <
>> georgi.boshnakov using manchester.ac.uk> wrote:
>>
>>> You may have a case to argue to CRAN that you can get the "almost"
>>> exemption (can't say without details) but your views look overly rigid.
>>>
>>> Exporting an object and marking it as internal is not a "work around",
>>> even less a "dirty trick".
>>> Export makes the object available outside the package's namespace and
>>> makes it clear that this is intentional.
>>> If you can't drop the 'package:::' prefix in your use case, this means
>>> that this is what you actually do (i.e. use those objects outside the
>>> namespace of the package). I would be grateful to CRAN for asking me to
>>> export and hence document this.
>>>
>>>
>>> Georgi Boshnakov
>>>
>>> PS Note that there is no such thing as "public namespace".
>>>
>>>
>>> -----Original Message-----
>>> From: R-package-devel <r-package-devel-bounces using r-project.org> On Behalf
>>> Of David Kepplinger
>>> Sent: 13 September 2020 20:52
>>> To: R Package Devel <r-package-devel using r-project.org>
>>> Subject: [R-pkg-devel] Use of `:::` in a package for code run in a
>>> parallel cluster
>>>
>>> Dear list members,
>>>
>>> I submitted an update for my package and got automatically rejected by the
>>> incoming checks (as expected from my own checks) for using `:::` calls to
>>> access the package's namespace.
>>> "There are ::: calls to the package's namespace in its code. A package
>>> *almost* never needs to use ::: for its own objects:…" (emphasis mine)
>>>
>>> This was a conscious decision on my part as the package runs code on a
>>> user-supplied parallel cluster and I consider cluster-exporting the
>>> required functions a no-go as it would potentially overwrite objects in the
>>> cluster's R sessions. The package code does not own the cluster and hence
>>> the R sessions. Therefore overwriting objects could potentially lead to
>>> unintended behaviour which is opaque to the user and difficult to debug.
>>>
>>> Another solution to circumvent the R CMD check note is to export the
>>> functions to the public namespace but mark them as internal. This was also
>>> suggested in another thread on this mailing list (cf. "Etiquette for
>>> package submissions that do not automatically pass checks?"). I do not
>>> agree with this work-around as the methods are indeed internal and should
>>> never be used by users. Exporting truly internal functions merely to
>>> satisfy R CMD check is hard to justify, in particular when `:::` offers a
>>> clean, well-documented solution.
>>>
>>> I argue `:::` is the only clean solution to this problem and no dirty
>>> work-arounds are necessary. This is a prime example of where `:::` is
>>> actually useful and needed inside a package. If the R community disagrees,
>>> I think R CMD check should at least emit a WARNING instead of a NOTE and
>>> elaborate on the problem and accepted work-arounds in "Writing R
>>> extensions". Alternatively, it could keep emitting a NOTE but list the
>>> currently nebulous circumstances under which `:::` would be tolerated
>>> inside a package. Having more transparent
>>> criteria for submitting to CRAN would be really helpful to the entire R
>>> community and probably also reduce the traffic on this mailing list.
>>>
>>> Best,
>>> David
>>>
>>> ______________________________________________
>>> R-package-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel