[R-pkg-devel] Use of `:::` in a package for code run in a parallel cluster
David Kepplinger
d@v|d@kepp||nger @end|ng |rom gm@||@com
Mon Sep 14 02:00:46 CEST 2020
Thank you all for the discussion and suggestions.
> so making a package function baz available makes all functions in the
> package available -- a function in the package already has access to other
> functions in the namespace, whether those functions are exported or not, so
> there is no need to use :::.
>
Thanks, Martin. I completely missed that the parallel package serializes
the entire environment of the function, including the package namespace,
so `:::` is indeed unnecessary in my use case. I probably experimented in
the global environment first and extrapolated the observed behaviour to the
package. Sorry for annoying everyone with this.
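The behaviour can be checked without a cluster. The sketch below is an illustration only, not code from the thread: it round-trips closures through serialize()/unserialize(), which roughly stands in for shipping a function to a worker. A closure created with local() carries its enclosing environment along. A package function (stats::sd is used here purely as a stand-in) comes back attached to its package namespace, which R records by reference, so the non-exported functions should be reachable on the worker as long as the package is installed there.

```r
# Closure with a private helper, analogous to a package function that
# calls a non-exported function from its own namespace.
baz <- local({
  foo <- function() 2
  function(...) foo()
})

# Round trip through serialization, standing in for the transfer to a worker.
baz_copy <- unserialize(serialize(baz, NULL))
baz_result <- baz_copy()   # foo is found in the enclosing environment

# A package function's enclosing environment is the package namespace.
sd_copy <- unserialize(serialize(stats::sd, NULL))
same_namespace <- identical(environment(sd_copy), asNamespace("stats"))
```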
I also have another use of `:::`, and I am not sure whether it counts as a
disallowed use, so I'm throwing it out there for feedback.
I have one internal function which checks a long list of common arguments
to several other functions, similar to
  internal_check_args <- function (sd = 2, type = c("bootstrap", "theoretical"), ...) {
    # check arguments for valid ranges, etc.
    return(list(sd = sd, type = match.arg(type)))
  }
And several functions which use the internal function for argument checking
and such, similar to
  exported_foo <- function (x, sd = 2, type = c("bootstrap", "theoretical")) {
    args_call <- match.call()
    args_call[[1]] <- quote(mypackage:::internal_check_args)
    args <- eval.parent(args_call)
  }

  exported_foo_cv <- function (x, cv_folds = 3, ...) {
    args_call <- match.call(expand.dots = TRUE)
    args_call[[1]] <- quote(mypackage:::internal_check_args)
    args <- eval.parent(args_call)
  }
This is modelled after what, e.g., `lm()` does with `model.frame()`, only
that `internal_check_args()` is not exported, hence I use `:::`. There are
other solutions for this type of use of `:::` (probably some considered
cleaner) but again without guidelines on when `:::` is acceptable it's
difficult for package maintainers to know when to use/not use it. From all
the discussions it seems that there is absolutely no acceptable use of
`:::` and work-arounds are always the better alternative.
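For illustration, one of those other solutions might look like the sketch below: the function object itself is placed into the call instead of a quoted `mypackage:::internal_check_args`. This is an assumption about what would suit such a package, not something proposed in the thread, and the body of internal_check_args() is a simplified stand-in.

```r
# Simplified stand-in for the internal argument checker (hypothetical body).
internal_check_args <- function(sd = 2, type = c("bootstrap", "theoretical"), ...) {
  stopifnot(is.numeric(sd), length(sd) == 1L, sd > 0)
  list(sd = sd, type = match.arg(type))
}

exported_foo <- function(x, sd = 2, type = c("bootstrap", "theoretical")) {
  args_call <- match.call()
  # Insert the function object rather than a symbol or a `:::` call, so the
  # lookup does not depend on where the call is later evaluated.
  args_call[[1]] <- internal_check_args
  eval.parent(args_call)
}

exported_foo_cv <- function(x, cv_folds = 3, ...) {
  args_call <- match.call(expand.dots = TRUE)
  args_call[[1]] <- internal_check_args
  eval.parent(args_call)
}

res_default <- exported_foo(1:10)
res_cv <- exported_foo_cv(1:10, cv_folds = 5, sd = 3, type = "theoretical")
```

Inside a package, the name internal_check_args is resolved in the package namespace when exported_foo() runs, so no `:::` is needed. One likely drawback is that error messages and tracebacks show the deparsed function body instead of a short name.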
In light of the other interesting points brought up by discussants, I also
want to honor their time and reply here.
> You may argue that you prefer pkg:::foo for some reason: to which I'd
> respond that you are being rude to the CRAN volunteers. I've offered
> two options (one in the previous thread, a different one here), and
> there was a third one in that thread offered by Ivan Krylov. Surely one
> of these is good enough for your needs, and you shouldn't force CRAN to
> handle you specially.
>
I am sorry it came across as rude when I tried to solicit arguments for why
the use of `:::` is considered "bad practice", while work-arounds are
considered to be okay. I wouldn't force CRAN to handle my case specially; I
rather wanted to challenge the general "attitude" towards the use of `:::`.
I am sure there is a need to discourage the use of `:::` in packages as
CRAN volunteers probably have seen hundreds of cases where `:::` was abused
(such as mine, as Martin Morgan pointed out).
> you can use
>
> get("internal_function", asNamespace("mypackage"))(arg1, arg2)
>
> In fact, if you look at the source code of `:::`, that's exactly how
> it is implemented:
>
That would be a work-around I would have used if necessary. But my general
question remains: why should I reinvent the wheel when R already comes with
`:::`? The only advantage of all the work-arounds I've seen would be to
trick R CMD check into accepting the code, when in fact the same "bad
practice" is practiced.
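To make the equivalence concrete, here is a small check against a base package. It is only an illustration: stats:::plot.lm serves as an example of a non-exported object, and nothing here is specific to the package under discussion.

```r
# plot.lm is registered as an S3 method but is not exported from stats.
is_exported <- "plot.lm" %in% getNamespaceExports("stats")

# Three ways of reaching the same internal object.
via_colons <- stats:::plot.lm
via_get <- get("plot.lm", envir = asNamespace("stats"))
via_utils <- utils::getFromNamespace("plot.lm", "stats")

all_same <- identical(via_colons, via_get) && identical(via_colons, via_utils)
```

All three routes return the same object, so they differ only in whether R CMD check flags them, not in what the code actually does.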
Best,
David
On Sun, Sep 13, 2020 at 3:04 PM Martin Morgan <mtmorgan.bioc using gmail.com>
wrote:
> At least in the 'parallel' package
>
> library(parallel)
> cl = makePSOCKcluster(2)
>
> and because of the nature of the R language, the entire namespace is
> exported, analogous to
>
> baz <- local({
> foo <- function() 2
> function(...) foo()
> })
>
> so making a package function baz available makes all functions in the
> package available -- a function in the package already has access to other
> functions in the namespace, whether those functions are exported or not, so
> there is no need to use :::.
>
> > parSapply(cl, 1:2, baz)
> [1] 2 2
>
> This is in contrast to what one might expect from exploring things on the
> command line, where foo is defined in the global environment and, by
> convention, the global environment is not serialized to the workers
>
> > foo <- function() 1
> > bar <- function(...) foo()
> > parLapply(cl, 1:2, bar)
> Error in checkForRemoteErrors(val) :
> 2 nodes produced errors; first error: could not find function "foo"
>
> Do you really need to use `:::`?
>
> Martin Morgan
>
>
>
> On 9/13/20, 3:52 PM, "R-package-devel on behalf of David Kepplinger" <
> r-package-devel-bounces using r-project.org on behalf of
> david.kepplinger using gmail.com> wrote:
>
> Dear list members,
>
> I submitted an update for my package and got automatically rejected by
> the incoming checks (as expected from my own checks) for using `:::`
> calls to access the package's namespace.
> "There are ::: calls to the package's namespace in its code. A package
> *almost* never needs to use ::: for its own objects:…" (emphasis mine)
>
> This was a conscious decision on my part as the package runs code on a
> user-supplied parallel cluster and I consider cluster-exporting the
> required functions a no-go as it would potentially overwrite objects in
> the cluster's R sessions. The package code does not own the cluster and
> hence the R sessions. Therefore overwriting objects could potentially
> lead to unintended behaviour which is opaque to the user and difficult
> to debug.
>
> Another solution to circumvent the R CMD check note is to export the
> functions to the public namespace but mark them as internal. This was
> also suggested in another thread on this mailing list (cf. "Etiquette
> for package submissions that do not automatically pass checks?"). I do
> not agree with this work-around as the methods are indeed internal and
> should never be used by users. Exporting truly internal functions for
> the sake of satisfying R CMD check is a bad argument, in particular if
> there is a clean, well-documented solution by using `:::`.
>
> I argue `:::` is the only clean solution to this problem and no dirty
> work-arounds are necessary. This is a prime example of where `:::` is
> actually useful and needed inside a package. If the R community
> disagrees, I think R CMD check should at least emit a WARNING instead of
> a NOTE and elaborate on the problem and accepted work-arounds in
> "Writing R Extensions". Or keep emitting a NOTE but list those nebulous
> reasons where `:::` would be tolerated inside a package. Having more
> transparent criteria for submitting to CRAN would be really helpful to
> the entire R community and probably also reduce the traffic on this
> mailing list.
>
> Best,
> David
>