Hi Gabe,

> There is another way the pipe could go into base R that could not be
> done in package space and has the potential to mitigate some pretty
> serious downsides to the pipes relating to debugging

I assume you're thinking about the large stack trace of the magrittr pipe? You don't need a parser transformation to solve that problem, though: the pipe could be implemented as a regular function with a very limited impact on the stack, and if implemented as a SPECIALSXP it would be completely invisible. We've been planning to rewrite %>% to fix the performance and the stack print; it's just low priority.
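To make the first point concrete, here is a minimal sketch of a pipe written as an ordinary R function. The operator name `%|>%` and the "insert the left-hand side as the first argument" rule are assumptions for illustration only, not the semantics planned for magrittr or for a native pipe; the point is simply that such a function adds only a frame or two to the call stack rather than the long chain you see today:

    ## Sketch only: a pipe as a plain R function (hypothetical `%|>%`).
    `%|>%` <- function(lhs, rhs) {
      call <- substitute(rhs)
      if (!is.call(call))
        stop("right-hand side must be a call, e.g. `x %|>% head(2)`")
      ## Rebuild the call with the left-hand side inserted as the first
      ## argument, then evaluate it in the caller's environment.
      expr <- as.call(c(call[[1L]], substitute(lhs), as.list(call)[-1L]))
      eval(expr, envir = parent.frame())
    }

    ## Usage: `mtcars %|>% head(2)` evaluates `head(mtcars, 2)` in the caller.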
About the semantics of local evaluation that were proposed in this thread, I think that wouldn't be right. A native pipe should be consistent with other control-flow constructs like `if` and `for` and evaluate in the current environment. In that case, the `.` binding, if any, would be restored to its original value via `on.exit()` (or through unwind-protection if implemented in C).
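As a rough illustration of what I mean (the helper name `pipe_eval()` is invented for this sketch and is not a real or planned API; a C implementation would rely on unwind-protection rather than `on.exit()`):

    pipe_eval <- function(lhs, rhs_expr, env = parent.frame()) {
      ## Evaluate one piped step in the *current* environment, binding `.`
      ## temporarily and restoring any previous `.` binding on exit.
      if (exists(".", envir = env, inherits = FALSE)) {
        old_dot <- get(".", envir = env, inherits = FALSE)
        on.exit(assign(".", old_dot, envir = env), add = TRUE)
      } else {
        on.exit(if (exists(".", envir = env, inherits = FALSE))
          rm(".", envir = env), add = TRUE)
      }
      assign(".", lhs, envir = env)
      eval(rhs_expr, envir = env)
    }

    ## Usage: pipe_eval(mtcars, quote(head(., 2))) evaluates head(., 2) with
    ## `.` bound to mtcars, then leaves the caller's `.` (if any) as it was.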
Best,
Lionel

> On 6 Oct 2019, at 01:50, Gabriel Becker <gabembec...@gmail.com> wrote:
>
> Hi all,
>
> I think there's some nuance here that makes me agree partially with each "side".
>
> The pipe is inarguably extremely popular. Many probably think of it as a core feature of R, along with the tidyverse that (as was pointed out) largely surrounds it and drives its popularity. Whether it's a good or bad thing that they think that doesn't change the fact that, by my estimation, Ant is correct that they do. BUT, I don't agree with him that that, by itself, is a reason to put it in base R in the form it exists in now. For the current form, there aren't really any major downsides that I see to having people just use the package version.
>
> Sure, it may be a little weird, but it doesn't ever really stop people from using it or present a significant barrier. Another major point is that many (most?) base R functions are not necessarily tooled to be endomorphic, which in my personal opinion is *largely* the only place where pipes are really compelling.
>
> That was for pipes as they exist in package space, though. There is another way the pipe could go into base R that could not be done in package space and has the potential to mitigate some pretty serious downsides to the pipes relating to debugging, which would be to implement them in the parser.
>
> If
>
>   iris %>% group_by(Species) %>% summarize(mean_sl = mean(Sepal.Length)) %>% filter(mean_sl > 5)
>
> were *parsed*, for example, into
>
>   local({
>     . = group_by(iris, Species)
>     ._tmp2 = summarize(., mean_sl = mean(Sepal.Length))
>     filter(._tmp2, mean_sl > 5)
>   })
>
> then debugging (once you knew that) would be much easier, but behavior would be the same as it is now. There could even be some sort of step-through-pipe debugger added at that point as well, for additional convenience.
>
> There is some minor precedent for that type of transformative parsing:
>
>   > expr = parse(text = "5 -> x")
>   > expr
>   expression(5 -> x)
>   > expr[[1]]
>   x <- 5
>
> Though that's a much more minor transformation.
>
> All of that said, I believe Jim Hester (cc'ed) suggested something along these lines at the R Summit a couple of years ago, and thus far R-core has not shown much appetite for changing things in the parser.
>
> Without that changing, I'd have to say that my vote, for whatever it's worth, comes down on the side of pipes being fine in packages. The summary of my reasoning is that it only makes sense for them to go into R itself if doing so fixes an issue that can't be fixed with them in package space.
>
> Best,
> ~G
>
> On Sun, Oct 6, 2019 at 5:26 AM Ant F <antoine.fa...@gmail.com> wrote:
>
>> Yes, but this exaggeration precisely misses the point.
>>
>> Concerning your examples:
>>
>> * I love fread, but I think it makes a lot of subjective choices that are best associated with a package. I think it changed a lot with time and can still change, and we have great developers willing to maintain it and be reactive regarding feature requests or bug reports.
>>
>> * group_by() adds a class that works only (or mostly) with tidyverse verbs; that makes it very easy to dismiss as a candidate for inclusion in base R.
>>
>> * summarize is an alternative to aggregate; it would be very confusing to have both.
>>
>> Now, to be fair to your argument, we could think of other functions such as data.table::rleid(), which I believe base R misses deeply, and there is nothing wrong with packaged functions making their way to base R.
>>
>> Maybe there's an existing list of criteria for inclusion in base R, but if not I can make one up for the sake of this discussion :) :
>> * 1) the functionality should not already exist
>> * 2) the function should be general enough
>> * 3) the function should have a large number of potential users
>> * 4) the function should be robust and not require extensive maintenance
>> * 5) the function should be stable; we shouldn't expect new features every 2 months
>> * 6) the function should have an intuitive interface in the context of the rest of base R
>>
>> I guess 1 and 6 could be held against my proposal, because:
>> (1) everything can be done without pipes
>> (6) they are somewhat surprising (though with explicit dots not that much, and not more surprising than, say, `bquote()`)
>>
>> In my opinion the pluses offset the minuses.
>>
>> I wouldn't advise taking magrittr's pipe (provided the license allows it), for instance, because it makes a lot of design choices and has complex behavior; what I propose is two lines of code very unlikely to evolve or require maintenance.
>>
>> Antoine
>>
>> PS: I just receive the digest once a day, so if you don't "reply all" I can only react later.
>>
>> On Sat, Oct 5, 2019 at 7:54 PM, Hugh Marera <hugh.mar...@gmail.com> wrote:
>>
>>> I exaggerated the comparison for effect. However, it is not very difficult to find functions in dplyr or data.table or indeed other packages that one may wish to be in base R. Examples, for me, could include data.table::fread, the dplyr::group_by & dplyr::summari[sZ]e combo, etc. Also, the "popularity" of magrittr::`%>%` is mostly attributable to the tidyverse (an advanced superset of R). Many R users don't even know that they are installing the magrittr package.
>>>
>>> On Sat, Oct 5, 2019 at 6:30 PM Iñaki Ucar <iu...@fedoraproject.org> wrote:
>>>
>>>> On Sat, 5 Oct 2019 at 17:15, Hugh Marera <hugh.mar...@gmail.com> wrote:
>>>>>
>>>>> How is your argument different to, say, "Should dplyr or data.table be part of base R as they are the most popular data science packages and they are used by a large number of users?"
>>>>
>>>> Two packages with many features, dozens of functions, and under heavy development to fix bugs, add new features and improve performance, vs. a single operator with limited and well-defined functionality, and a reference implementation that hasn't changed in years (but one that is certainly hackish in a way that probably could only be improved from R itself).
>>>>
>>>> Can't you really spot the difference?
>>>>
>>>> Iñaki

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel