On 6 Oct 2019, at 01:50, Gabriel Becker <gabembec...@gmail.com> wrote:
Hi all,
I think there's some nuance here that makes me agree partially with
each "side".
The pipe is inarguably extremely popular. Many probably think of it as a
core feature of R, along with the tidyverse that (as was pointed out)
largely surrounds it and drives its popularity. Whether it's a good or bad
thing that they think that doesn't change the fact that, by my estimation,
Ant is correct that they do. BUT, I don't agree with him that that, by
itself, is a reason to put it in base R in the form that it exists now. For
the current form, there aren't really any major downsides that I see to
having people just use the package version.
Sure, it may be a little weird, but it doesn't ever really stop
people from using it or present a significant barrier. Another major point
is that many (most?) base R functions are not necessarily tooled to be
endomorphic, which in my personal opinion is *largely* the only place that
the pipes are really compelling.
That was for pipes as they exist in package space, though. There is another
way the pipe could go into base R that could not be done in package space
and has the potential to mitigate some pretty serious downsides to the
pipes relating to debugging, which would be to implement them in the parser.
If
iris %>% group_by(Species) %>% summarize(mean_sl = mean(Sepal.Length)) %>%
filter(mean_sl > 5)
were *parsed* into, for example,
local({
. = group_by(iris, Species)
._tmp2 = summarize(., mean_sl = mean(Sepal.Length))
filter(._tmp2, mean_sl > 5)
})
then debugging (once you knew that) would be much easier, but behavior
would be the same as it is now. There could even be some sort of
step-through-pipe debugger added at that point for additional
convenience.
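
To make the idea concrete, here is a rough sketch of that kind of rewrite done in user code on a quoted expression rather than in the parser. The helper name `unpipe` is hypothetical, it always inserts the left-hand side as the first argument (no explicit-dot placement or other magrittr features), and it rebinds `.` at every step instead of introducing separate temporaries. It only illustrates the shape of the transformation, not how R-core would implement it.

```r
# Hypothetical helper: rewrite a quoted %>% chain into a local({ ... }) call.
# Simplifying assumption: the left-hand side always becomes the first argument.
unpipe <- function(expr) {
  steps <- list()
  # The chain is left-nested, so peel right-hand sides off from the outside in.
  while (is.call(expr) && identical(expr[[1]], as.name("%>%"))) {
    steps <- c(list(expr[[3]]), steps)
    expr <- expr[[2]]
  }
  dot <- as.name(".")
  body <- list(call("=", dot, expr))      # . = <initial value>
  n <- length(steps)
  for (i in seq_len(n)) {
    # f(a, b) becomes f(., a, b)
    step <- as.call(append(as.list(steps[[i]]), list(dot), after = 1))
    # Intermediate steps rebind `.`; the last step is the block's value.
    body[[i + 1]] <- if (i < n) call("=", dot, step) else step
  }
  block <- as.call(c(list(as.name("{")), body))
  call("local", block)
}

# %>% never has to be defined here because the chain is only quoted.
rewritten <- unpipe(quote(mtcars %>% subset(cyl == 4) %>% nrow()))
print(rewritten)
result <- eval(rewritten)
```

Because each step is an ordinary statement inside the block, a traceback or a browser() session would show the individual calls rather than the internals of a pipe function.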
There is some minor precedent for that type of transformative parsing:
> expr = parse(text = "5 -> x")
> expr
expression(5 -> x)
> expr[[1]]
x <- 5
Though that's a much more minor transformation.
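
As a small check of that precedent, the snippet below compares what the parser produces for two spellings of the same operation. The `**` case is not from the message above; it is added here as a second instance of the parser silently rewriting input, which ?Arithmetic documents.

```r
# Right assignment is turned into a left-assignment call by the parser.
same_assign <- identical(quote(5 -> x), quote(x <- 5))

# `**` is translated to `^` while parsing, so no `**` call ever exists.
same_power <- identical(quote(2 ** 3), quote(2 ^ 3))
power_text <- deparse(quote(2 ** 3))

print(c(same_assign, same_power))
print(power_text)
```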
All of that said, I believe Jim Hester (cc'ed) suggested something along
these lines at the RSummit a couple of years ago, and thus far R-core has
not shown much appetite for changing things in the parser.
Without that changing, I'd have to say that my vote, for whatever it's
worth, comes down on the side of pipes being fine in packages. In summary,
my reasoning is that it only makes sense for them to go into R itself if
doing so fixes an issue that can't be fixed with them in package space.
Best,
~G
On Sun, Oct 6, 2019 at 5:26 AM Ant F <antoine.fa...@gmail.com> wrote:
Yes, but this exaggeration precisely misses the point.
Concerning your examples:
* I love fread, but I think it makes a lot of subjective choices that are
best associated with a package. I think it has changed a lot over time and
can still change, and we have great developers willing to maintain it and
be responsive to feature requests and bug reports.
* group_by() adds a class that works only (or mostly) with tidyverse verbs,
so it is very easy to dismiss as a candidate for inclusion in base R.
* summarize is an alternative to aggregate; it would be very confusing to
have both.
Now to be fair to your argument we could think of other functions such as
data.table::rleid() which I believe base R misses deeply,
and there is nothing wrong with packaged functions making their way to base
R.
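
For readers who don't know rleid(): it assigns an increasing id to each run of consecutive equal values. A rough base-R approximation for a single atomic vector could look like the sketch below. The name `rleid_sketch` is made up, and this is not data.table's implementation: it does not handle multiple columns, and rle() starts a new run at every NA, so results may differ from data.table::rleid() on data with missing values.

```r
# Hypothetical base-R approximation of a run-length id for one atomic vector.
rleid_sketch <- function(x) {
  runs <- rle(x)
  # Repeat each run's index as many times as the run is long.
  rep(seq_along(runs$lengths), times = runs$lengths)
}

ids <- rleid_sketch(c("a", "a", "b", "a", "a", "a"))
print(ids)  # 1 1 2 3 3 3
```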
Maybe there's an existing list of criteria for inclusion in base R, but if
not I can make one up for the sake of this discussion :) :
* 1) the functionality should not already exist
* 2) the function should be general enough
* 3) the function should have a large number of potential users
* 4) the function should be robust, and not require extensive maintenance
* 5) the function should be stable, we shouldn't expect new features every 2
months
* 6) the function should have an intuitive interface in the context of the
rest of base R
I guess 1 and 6 could be held against my proposal, because:
(1) everything can be done without pipes
(6) They are somewhat surprising (though with explicit dots not that much,
and not more surprising than say `bquote()`)
In my opinion the pluses outweigh the minuses.
I wouldn't advise taking magrittr's pipe (provided the license allows it),
for instance, because it makes a lot of design choices and has complex
behavior; what I propose is 2 lines of code very unlikely to evolve or
require maintenance.
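
The two lines themselves are not quoted in this message, so the following is only a guess at what a minimal pipe with explicit dots might look like. The operator name `%then%` is invented for illustration. It evaluates the right-hand side with `.` bound to the left-hand value and does nothing else (no first-argument insertion, no functional sequences).

```r
# Hypothetical minimal pipe: the right-hand side must mention `.` explicitly.
`%then%` <- function(lhs, rhs) {
  eval(substitute(rhs), envir = list(. = lhs), enclos = parent.frame())
}

total <- c(1, 4, 9) %then% sqrt(.) %then% sum(.)
print(total)  # 6
```

Using the caller's frame as the enclosure means other variables on the right-hand side resolve as they would in an ordinary call.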
Antoine
PS: I just receive the digest once a day, so if you don't "reply all" I can
only react later.
On Sat, 5 Oct 2019 at 19:54, Hugh Marera <hugh.mar...@gmail.com> wrote:
I exaggerated the comparison for effect. However, it is not very difficult
to find functions in dplyr or data.table or indeed other packages that one
may wish to be in base R. Examples, for me, could include
data.table::fread, dplyr::group_by & dplyr::summari[sZ]e combo, etc. Also,
the "popularity" of magrittr::`%>%` is mostly attributable to the tidyverse
(an advanced superset of R). Many R users don't even know that they are
installing the magrittr package.
On Sat, Oct 5, 2019 at 6:30 PM Iñaki Ucar <iu...@fedoraproject.org> wrote:
On Sat, 5 Oct 2019 at 17:15, Hugh Marera <hugh.mar...@gmail.com> wrote:
How is your argument different to, say, "Should dplyr or data.table be
part of base R as they are the most popular data science packages and they
are used by a large number of users?"
Two packages with many features, dozens of functions and under heavy
development to fix bugs, add new features and improve performance, vs.
a single operator with a limited and well-defined functionality, and a
reference implementation that hasn't changed in years (but certainly
hackish in a way that probably could only be improved from R itself).
Can't you really spot the difference?
Iñaki
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel