I'm largely with Gabriel Becker on this one: if pipes enter base R, they should be a well thought out and integrated part of the language.
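
To make "integrated part of the language" a bit more concrete: the parser-level idea Gabriel describes below boils down to rewriting the pipe chain into ordinary calls before anything is evaluated. Here is a rough sketch of that rewrite in plain R, working on a quoted expression rather than inside the parser. `unpipe()` is a name I made up for illustration, it handles only the simple first-argument case, and as far as I know nothing like it exists in base R or magrittr:

```r
# Hypothetical helper: rewrite `lhs %>% f(args)` into `f(lhs, args)`,
# recursively, without evaluating anything. Only first-argument piping is
# handled; explicit dots, lambdas, etc. are deliberately ignored.
unpipe <- function(expr) {
  is_pipe <- is.call(expr) && identical(expr[[1]], as.name("%>%"))
  if (!is_pipe) {
    return(expr)
  }
  lhs <- unpipe(expr[[2]])     # the left-hand side may itself be a pipe chain
  rhs <- expr[[3]]
  if (!is.call(rhs)) {
    rhs <- as.call(list(rhs))  # allow a bare function name, e.g. `x %>% head`
  }
  # Put lhs in front of the existing arguments of the right-hand call
  as.call(c(list(rhs[[1]], lhs), as.list(rhs)[-1]))
}

# quote() keeps the expression unevaluated, so no pipe package is needed
piped <- quote(
  iris %>% group_by(Species) %>% summarize(mean_sl = mean(Sepal.Length))
)
plain <- unpipe(piped)
print(plain)
```

Once the chain is an ordinary call like this, the usual debugging tools see nothing but regular function calls, which is the whole appeal of doing it at parse time.
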
I do see merit, though, in providing a pipe in base R. The main reason is that right now there is no single pipe. A pipe function exists in several packages, and it's not impossible that at some point piping operators might behave slightly differently depending on the package you load.

So I hope someone from RStudio is reading this thread and decides to do the heavy lifting for R core. After all, it really is mainly their packages that would benefit from it. I can't think of a non-tidyverse package that's easier to use with pipes than without.

Best
Joris

On Sun, Oct 6, 2019 at 1:50 AM Gabriel Becker <gabembec...@gmail.com> wrote:

> Hi all,
>
> I think there's some nuance here that makes me agree partially with
> each "side".
>
> The pipe is inarguably extremely popular. Many probably think of it as a
> core feature of R, along with the tidyverse that (as was pointed out)
> largely surrounds it and drives its popularity. Whether it's a good or bad
> thing that they think that doesn't change the fact that, by my estimation,
> Ant is correct that they do. BUT, I don't agree with him that that, by
> itself, is a reason to put it in base R in the form that it exists now. For
> the current form, there aren't really any major downsides that I see to
> having people just use the package version.
>
> Sure it may be a little weird, but it doesn't ever really stop
> people from using it or present a significant barrier. Another major point
> is that many (most?) base R functions are not necessarily tooled to be
> endomorphic, which in my personal opinion is *largely* the only place that
> the pipes are really compelling.
>
> That was for pipes as they exist in package space, though. There is another
> way the pipe could go into base R that could not be done in package space
> and has the potential to mitigate some pretty serious downsides to the
> pipes relating to debugging, which would be to implement them in the
> parser.
>
> If
>
> iris %>% group_by(Species) %>% summarize(mean_sl = mean(Sepal.Length)) %>%
> filter(mean_sl > 5)
>
> were *parsed*, for example, into
>
> local({
>   . = group_by(iris, Species)
>   ._tmp2 = summarize(., mean_sl = mean(Sepal.Length))
>   filter(._tmp2, mean_sl > 5)
> })
>
> then debugging (once you knew that) would be much easier, but behavior
> would be the same as it is now. There could even be some sort of
> step-through-pipe debugger added at that point for additional
> convenience.
>
> There is some minor precedent for that type of transformative parsing:
>
> > expr = parse(text = "5 -> x")
> > expr
> expression(5 -> x)
> > expr[[1]]
> x <- 5
>
> Though that's a much more minor transformation.
>
> All of that said, I believe Jim Hester (cc'ed) suggested something along
> these lines at the RSummit a couple of years ago, and thus far R-core has
> not shown much appetite for changing things in the parser.
>
> Without that changing, I'd have to say that my vote, for whatever it's
> worth, comes down on the side of pipes being fine in packages. A summary of
> my reasoning being that it only makes sense for them to go into R itself if
> doing so fixes an issue that can't be fixed with them in package space.
>
> Best,
> ~G
>
> On Sun, Oct 6, 2019 at 5:26 AM Ant F <antoine.fa...@gmail.com> wrote:
>
> > Yes but this exaggeration precisely misses the point.
> >
> > Concerning your examples:
> >
> > * I love fread but I think it makes a lot of subjective choices that are
> > best associated with a package. I think it changed a lot with time and
> > can still change, and we have great developers willing to maintain it
> > and be reactive regarding feature requests or bug reports
> >
> > * group_by() adds a class that works only (or mostly) with tidyverse
> > verbs, so it's very easy to dismiss as a candidate for inclusion in
> > base R.
> >
> > * summarize is an alternative to aggregate, and it would be very
> > confusing to have both
> >
> > Now to be fair to your argument we could think of other functions such as
> > data.table::rleid() which I believe base R misses deeply,
> > and there is nothing wrong with packaged functions making their way to
> > base R.
> >
> > Maybe there's an existing list of criteria for inclusion in base R, but
> > if not I can make one up for the sake of this discussion :) :
> > * 1) the functionality should not already exist
> > * 2) the function should be general enough
> > * 3) the function should have a large number of potential users
> > * 4) the function should be robust, and not require extensive maintenance
> > * 5) the function should be stable, we shouldn't expect new features
> > every 2 months
> > * 6) the function should have an intuitive interface in the context of
> > the rest of base R
> >
> > I guess 1 and 6 could be held against my proposal, because:
> > (1) everything can be done without pipes
> > (6) They are somewhat surprising (though with explicit dots not that
> > much, and not more surprising than say `bquote()`)
> >
> > In my opinion the + offset the -.
> >
> > I wouldn't advise taking magrittr's pipe (providing the license allows
> > so) for instance, because it makes a lot of design choices and has a
> > complex behavior. What I propose is 2 lines of code very unlikely to
> > evolve or require maintenance.
> >
> > Antoine
> >
> > PS: I just receive the digest once a day so if you don't "reply all" I
> > can only react later.
> >
> > On Sat, Oct 5, 2019 at 19:54, Hugh Marera <hugh.mar...@gmail.com> wrote:
> >
> > > I exaggerated the comparison for effect. However, it is not very
> > > difficult to find functions in dplyr or data.table or indeed other
> > > packages that one may wish to be in base R. Examples, for me, could
> > > include data.table::fread, dplyr::group_by & dplyr::summari[sz]e
> > > combo, etc.
> > > Also, the "popularity" of magrittr::`%>%` is mostly attributable to
> > > the tidyverse (an advanced superset of R). Many R users don't even
> > > know that they are installing the magrittr package.
> > >
> > > On Sat, Oct 5, 2019 at 6:30 PM Iñaki Ucar <iu...@fedoraproject.org>
> > > wrote:
> > >
> > > > On Sat, 5 Oct 2019 at 17:15, Hugh Marera <hugh.mar...@gmail.com>
> > > > wrote:
> > > > >
> > > > > How is your argument different to, say, "Should dplyr or
> > > > > data.table be part of base R as they are the most popular data
> > > > > science packages and they are used by a large number of users?"
> > > >
> > > > Two packages with many features, dozens of functions and under heavy
> > > > development to fix bugs, add new features and improve performance, vs.
> > > > a single operator with a limited and well-defined functionality, and a
> > > > reference implementation that hasn't changed in years (but certainly
> > > > hackish in a way that probably could only be improved from R itself).
> > > >
> > > > Can't you really spot the difference?
> > > >
> > > > Iñaki
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel

--
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)
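
On Antoine's "2 lines of code": I don't know his exact definition, but a pipe with an explicit dot can indeed be very short. A minimal sketch, assuming that is roughly what is meant. The operator name `%.%` is my own placeholder, and this is neither magrittr's implementation nor a concrete proposal for base R:

```r
# Hypothetical explicit-dot pipe: evaluate the right-hand side with `.`
# bound to the value of the left-hand side; every other name is looked up
# in the caller's environment.
`%.%` <- function(lhs, rhs) {
  eval(substitute(rhs), envir = list(. = lhs), enclos = parent.frame())
}

total <- c(4, 9, 16) %.% sqrt(.) %.% sum(.)
print(total)  # 9

# The dot can appear anywhere, not only as the first argument
label <- "x" %.% paste0("col_", ., "_mean")
print(label)  # "col_x_mean"
```

Something this small has little to maintain, but it also does nothing for the debugging problem Gabriel raises, since it is still an ordinary function call wrapping each step.
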