Hi all,

I’d love to get your feedback on the conflicted package, which provides an alternative strategy for resolving ambiguous function names (i.e. when multiple packages provide identically named functions). conflicted 0.1.0 is already on CRAN, but I’m currently preparing a revision (<https://github.com/r-lib/conflicted>) and am looking for feedback on it.

As you are no doubt aware, R’s default approach means that the most recently loaded package “wins” any conflicts. You do get a message about conflicts on load, but I see a lot of newer R users experiencing problems caused by function conflicts. I think there are three primary reasons:

- People don’t read messages about conflicts. Even if you are conscientious and do read the messages, it’s hard to notice a single new conflict caused by a package upgrade.

- The warning and the problem may be quite far apart. If you load all your packages at the top of the script, it may be hundreds of lines before you encounter a conflict.

- The error messages caused by conflicts are cryptic because you end up calling a function with utterly unexpected arguments.

For these reasons, conflicted takes an alternative approach, forcing the user to explicitly disambiguate any conflicts:

    library(conflicted)
    library(dplyr)
    library(MASS)

    select
    #> Error: [conflicted] `select` found in 2 packages.
    #> Either pick the one you want with `::`
    #> * MASS::select
    #> * dplyr::select
    #> Or declare a preference with `conflicted_prefer()`
    #> * conflict_prefer("select", "MASS")
    #> * conflict_prefer("select", "dplyr")

conflicted works by attaching a new “conflicted” environment just after the global environment. This environment contains an active binding for each ambiguous name. The conflicted environment also contains bindings for `library()` and `require()` that rebuild the conflicted environment and suppress default reporting (but are otherwise thin wrappers around the base equivalents).

conflicted also provides a `conflict_scout()` helper which you can use to see what’s going on:

    conflict_scout(c("dplyr", "MASS"))
    #> 1 conflict:
    #> * `select`: dplyr, MASS

conflicted applies a few heuristics to minimise false positives (at the cost of introducing a few false negatives). The overarching goal is to ensure that code behaves identically regardless of the order in which packages are attached.

- A number of packages provide a function that appears to conflict with a function in a base package, but they follow the superset principle (i.e. they only extend the API, as explained to me by Hervé Pagès).

  conflicted assumes that packages adhere to the superset principle, which appears to be true in most of the cases that I’ve seen. For example, the lubridate package provides `as.difftime()` and `date()` which extend the behaviour of base functions, and provides S4 generics for the set operators.

      conflict_scout(c("lubridate", "base"))
      #> 5 conflicts:
      #> * `as.difftime`: [lubridate]
      #> * `date`       : [lubridate]
      #> * `intersect`  : [lubridate]
      #> * `setdiff`    : [lubridate]
      #> * `union`      : [lubridate]

  There are two popular functions that don’t adhere to this principle: `dplyr::filter()` and `dplyr::lag()` :(. conflicted handles these special cases so they correctly generate conflicts. (I sure wish I’d known about the superset principle when creating dplyr!)

      conflict_scout(c("dplyr", "stats"))
      #> 2 conflicts:
      #> * `filter`: dplyr, stats
      #> * `lag`   : dplyr, stats

- Deprecated functions should never win a conflict, so conflicted checks for use of `.Deprecated()`. This rule is very useful when moving functions from one package to another. For example, many devtools functions were moved to usethis, and conflicted ensures that you always get the non-deprecated version, regardless of package attach order:

      head(conflict_scout(c("devtools", "usethis")))
      #> 26 conflicts:
      #> * `use_appveyor`       : [usethis]
      #> * `use_build_ignore`   : [usethis]
      #> * `use_code_of_conduct`: [usethis]
      #> * `use_coverage`       : [usethis]
      #> * `use_cran_badge`     : [usethis]
      #> * `use_cran_comments`  : [usethis]
      #> ...

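The deprecation rule above could be approximated with a check like the one below. This is only a rough sketch in base R, not conflicted's actual implementation: `calls_deprecated()`, `old_use_thing()`, `new_use_thing()` and the `oldpkg`/`newpkg` labels are made-up names for illustration, and it assumes that a deprecated function calls `.Deprecated()` directly in its own body.

```r
# Hypothetical helper (not conflicted's own code): TRUE if the body of `f`
# contains a call to .Deprecated().
calls_deprecated <- function(f) {
  # Primitives and non-functions have no R-level body to inspect
  if (!is.function(f) || is.primitive(f)) {
    return(FALSE)
  }
  # all.names() lists every function and variable name used in the body
  ".Deprecated" %in% all.names(body(f))
}

# Toy functions standing in for the old and new home of the same name
old_use_thing <- function() {
  .Deprecated("newpkg::use_thing")
  invisible(NULL)
}
new_use_thing <- function() invisible(NULL)

# Candidates for one conflicting name, labelled by (made-up) package
candidates <- list(oldpkg = old_use_thing, newpkg = new_use_thing)

# Drop every candidate that is deprecated; what remains can win the conflict
keep <- !vapply(candidates, calls_deprecated, logical(1))
names(candidates)[keep]
```

If exactly one candidate survives this filter, there is nothing left to disambiguate, which matches the single `[usethis]` winner shown for each name in the output above.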
Finally, as mentioned above, the user can declare preferences:

    conflict_prefer("select", "MASS")
    #> [conflicted] Will prefer MASS::select over any other package
    conflict_scout(c("dplyr", "MASS"))
    #> 1 conflict:
    #> * `select`: [MASS]

I’d love to hear what people think about the general idea, and whether there are any obviously missing pieces.

Thanks!

Hadley

-- 
http://hadley.nz

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel