[Rd] conflicted: an alternative conflict resolution strategy
Joris Meys
jorismeys at gmail.com
Fri Aug 24 11:28:28 CEST 2018
Dear Hadley,
There have been several emails from you lately about packages on R-devel. I would
argue that the appropriate list for that is R-pkg-devel, as I've been told
myself not too long ago. People might get confused and think this is about
a change to R itself, which it obviously is not.
Kind regards
Joris
On Thu, Aug 23, 2018 at 8:32 PM Hadley Wickham <h.wickham using gmail.com> wrote:
> Hi all,
>
> I’d love to get your feedback on the conflicted package, which provides an
> alternative strategy for resolving ambiguous function names (i.e. when
> multiple packages provide identically named functions). conflicted 0.1.0
> is already on CRAN, but I’m currently preparing a revision
> (<https://github.com/r-lib/conflicted>), and looking for feedback.
>
> As you are no doubt aware, R’s default approach means that the most
> recently loaded package “wins” any conflicts. You do get a message about
> conflicts on load, but I see a lot of newer R users experiencing problems
> caused by function conflicts. I think there are three primary reasons:
>
> - People don’t read messages about conflicts. Even if you are
> conscientious and do read the messages, it’s hard to notice a single
> new conflict caused by a package upgrade.
>
> - The warning and the problem may be quite far apart. If you load all
> your packages at the top of the script, it may potentially be 100s
> of lines before you encounter a conflict.
>
> - The error messages caused by conflicts are cryptic because you end
> up calling a function with utterly unexpected arguments.
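To make the masking behaviour described above concrete, here is a minimal base R sketch. The two "packages" (`fake:pkg_a`, `fake:pkg_b`) and their `select()` definitions are hypothetical stand-ins, simulated by attaching plain lists to the search path; they are not taken from dplyr or MASS.

```r
# Two hypothetical "packages", simulated as lists attached to the search path.
pkg_a <- list(select = function(x) paste("pkg_a selected", x))
pkg_b <- list(select = function(x, y) x + y)

attach(pkg_a, name = "fake:pkg_a")
attach(pkg_b, name = "fake:pkg_b")  # attached last, so it masks fake:pkg_a

# find() lists the search-path entries that define `select`, in lookup order.
found <- find("select")

# Code written for pkg_a's select() now reaches pkg_b's version and fails
# with an error that says nothing about the conflict itself.
res <- tryCatch(select("mpg"), error = function(e) e)

# Clean up the search path.
detach("fake:pkg_b", character.only = TRUE)
detach("fake:pkg_a", character.only = TRUE)
```

The error in `res` comes from inside the masking function, which is the kind of cryptic failure the third point refers to.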
>
> For these reasons, conflicted takes an alternative approach, forcing the
> user to explicitly disambiguate any conflicts:
>
> library(conflicted)
> library(dplyr)
> library(MASS)
>
> select
> #> Error: [conflicted] `select` found in 2 packages.
> #> Either pick the one you want with `::`
> #> * MASS::select
> #> * dplyr::select
> #> Or declare a preference with `conflicted_prefer()`
> #> * conflict_prefer("select", "MASS")
> #> * conflict_prefer("select", "dplyr")
>
> conflicted works by attaching a new “conflicted” environment just after
> the global environment. This environment contains an active binding for
> any ambiguous bindings. The conflicted environment also contains
> bindings for `library()` and `require()` that rebuild the conflicted
> environment and suppress default reporting (but are otherwise thin wrappers
> around the base equivalents).
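The mechanism described above can be approximated in base R. This is only a sketch of the idea, not conflicted's actual implementation: the environment name `fake:conflicts` and the error text are assumptions made for illustration.

```r
# Attach an empty environment directly after the global environment.
# attach(NULL, ...) returns the newly attached environment invisibly.
shim <- attach(NULL, name = "fake:conflicts")

# An active binding runs a function every time the name is looked up,
# so any use of `select` raises an error instead of silently resolving.
makeActiveBinding(
  "select",
  function() stop("`select` found in 2 packages; pick one with `::`.", call. = FALSE),
  shim
)

# Looking up `select` from the global environment hits the binding first.
msg <- tryCatch(select, error = function(e) conditionMessage(e))

# Clean up the search path.
detach("fake:conflicts", character.only = TRUE)
```

Because the shim sits ahead of every attached package, the lookup never reaches whichever package happened to be loaded last.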
>
> conflicted also provides a `conflict_scout()` helper which you can use
> to see what’s going on:
>
> conflict_scout(c("dplyr", "MASS"))
> #> 1 conflict:
> #> * `select`: dplyr, MASS
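The kind of report shown above boils down to finding names exported by more than one package. A minimal sketch, assuming a hypothetical helper `scout()` and made-up, incomplete export lists (in a real session `getNamespaceExports()` could supply them):

```r
# Hypothetical helper: given a named list of exported names per package,
# return the names that more than one package provides.
scout <- function(exports) {
  owners <- split(
    rep(names(exports), lengths(exports)),   # package owning each name
    unlist(exports, use.names = FALSE)       # the names themselves
  )
  Filter(function(pkgs) length(pkgs) > 1, owners)
}

# Illustrative export lists, not the real exports of these packages.
exports <- list(
  dplyr = c("select", "filter", "mutate"),
  MASS  = c("select", "lda")
)

conflicts_found <- scout(exports)
```

This ignores the heuristics discussed next; it only detects raw name clashes.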
>
> conflicted applies a few heuristics to minimise false positives (at the
> cost of introducing a few false negatives). The overarching goal is to
> ensure that code behaves identically regardless of the order in which
> packages are attached.
>
> - A number of packages provide a function that appears to conflict
> with a function in a base package, but they follow the superset
> principle (i.e. they only extend the API, as explained to me by
> Hervé Pagès).
>
> conflicted assumes that packages adhere to the superset principle,
> which appears to be true in most of the cases that I’ve seen. For
> example, the lubridate package provides `as.difftime()` and `date()`
> which extend the behaviour of base functions, and provides S4
> generics for the set operators.
>
> conflict_scout(c("lubridate", "base"))
> #> 5 conflicts:
> #> * `as.difftime`: [lubridate]
> #> * `date`       : [lubridate]
> #> * `intersect`  : [lubridate]
> #> * `setdiff`    : [lubridate]
> #> * `union`      : [lubridate]
>
> There are two popular functions that don’t adhere to this principle:
> `dplyr::filter()` and `dplyr::lag()` :(. conflicted handles these
> special cases so they correctly generate conflicts. (I sure wish I’d
> known about the superset principle when creating dplyr!)
>
> conflict_scout(c("dplyr", "stats"))
> #> 2 conflicts:
> #> * `filter`: dplyr, stats
> #> * `lag`   : dplyr, stats
>
> - Deprecated functions should never win a conflict, so conflicted
> checks for use of `.Deprecated()`. This rule is very useful when
> moving functions from one package to another. For example, many
> devtools functions were moved to usethis, and conflicted ensures
> that you always get the non-deprecated version, regardless of package
> attach order:
>
> head(conflict_scout(c("devtools", "usethis")))
> #> 26 conflicts:
> #> * `use_appveyor`       : [usethis]
> #> * `use_build_ignore`   : [usethis]
> #> * `use_code_of_conduct`: [usethis]
> #> * `use_coverage`       : [usethis]
> #> * `use_cran_badge`     : [usethis]
> #> * `use_cran_comments`  : [usethis]
> #> ...
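The two heuristics above could be approximated as follows. Both checks are assumptions about how such rules might be written, not a description of what conflicted actually does, and `extends_api()`, `is_deprecated()` and the `fake_*` functions are hypothetical names.

```r
# Superset check (assumed rule): the candidate accepts every argument
# that the original function accepts.
extends_api <- function(candidate, original) {
  all(names(formals(original)) %in% names(formals(candidate)))
}

# Deprecation check (assumed rule): the function body mentions .Deprecated.
is_deprecated <- function(fn) {
  ".Deprecated" %in% all.names(body(fn))
}

# Hypothetical stand-ins for functions provided by other packages.
fake_date   <- function(x) NULL          # extends base::date(), which takes no arguments
fake_filter <- function(.data, ...) NULL # drops the arguments of stats::filter()
fake_old    <- function() {
  .Deprecated("newpkg::fake_old")
  invisible(NULL)
}

date_ok     <- extends_api(fake_date, base::date)
filter_ok   <- extends_api(fake_filter, stats::filter)
old_flagged <- is_deprecated(fake_old)
new_flagged <- is_deprecated(fake_date)
```

Comparing argument names is a crude proxy for the superset principle: it says nothing about whether the behaviour for the shared arguments is preserved.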
>
> Finally, as mentioned above, the user can declare preferences:
>
> conflict_prefer("select", "MASS")
> #> [conflicted] Will prefer MASS::select over any other package
> conflict_scout(c("dplyr", "MASS"))
> #> 1 conflict:
> #> * `select`: [MASS]
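A declared preference can be thought of as a lookup table consulted before a conflict is reported. The sketch below assumes a hypothetical `resolve()` helper and a plain list as the preference store; it is not how conflicted stores preferences.

```r
# Hypothetical resolver: return the package that should supply `fun`,
# or fail if the conflict has not been disambiguated.
resolve <- function(fun, candidates, prefs = list()) {
  winner <- prefs[[fun]]
  if (!is.null(winner) && winner %in% candidates) {
    return(winner)
  }
  if (length(candidates) == 1) {
    return(candidates)
  }
  stop(sprintf("`%s` found in %d packages: %s",
               fun, length(candidates), paste(candidates, collapse = ", ")),
       call. = FALSE)
}

candidates <- c("dplyr", "MASS")

# Without a preference the conflict is an error.
unresolved <- tryCatch(resolve("select", candidates), error = function(e) e)

# With a declared preference the conflict resolves deterministically.
chosen <- resolve("select", candidates, prefs = list(select = "MASS"))
```

The result no longer depends on the order in which the packages were attached, which is the goal stated earlier in the message.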
>
> I’d love to hear what people think about the general idea, and if there
> are any obviously missing pieces.
>
> Thanks!
>
> Hadley
>
>
> --
> http://hadley.nz
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Joris Meys
Statistical consultant
Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)
-----------
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php