Re: [Rd] New pipe operator
On 06/12/2020 8:22 p.m., Bravington, Mark (Data61, Hobart) wrote: Seems like this *could* be a good thing, and thanks to R core for considering it. But, FWIW: - I agree with Gabor G that consistency of "syntax" should be paramount here. Enough problems have been caused by earlier superficially-convenient non-standard features in R. In particular: -- there should not be any discrepancy between an in-place function-definition, and a predefined function attached to a symbol (as per Gabor's point). -- Hence, the ability to say x |> foo ie without parentheses, seems bound to lead to inconsistency, because x |> foo is allowed, x |> base::foo isn't allowed without tricks, but x |> function( y) foo( y) isn't... So, x |> foo is not worth keeping. Parentheses are a price well worth paying. -- it is still inconsistent and confusing to (apparently) invoke a function in some places--- normally--- via 'foo(x)', yet in others--- pipily--- via 'foo()'. Especially if 'foo' already has a default value for its first argument. - I don't see the problem with a placeholder--- doesn't it remove all ambiguity? Sure there needs to be a standard unclashable name and people can argue about what that should be, but the following seems clear and flexible... to me, anyway: thing |> foo( _PIPE_) |> # standard bah( arg1, _PIPE_) |> # multi-arg function _ANON_({ x <- sum( _PIPE_); _PIPE_/x + x/_PIPE_ }) # anon function where '_PIPE_' is the ordained name of the placeholder, and '_ANON_' constructs-and-calls a function with single argument '_PIPE_'. There is just one rule (I think...): each pipe-stage must be a *call* involving the argument '_PIPE_'. I believe there's no ambiguity if the placeholder is *only* allowed in the RHS of a pipe expression. I think the ambiguity arises if you allow the same syntax to be used to generate anonymous functions. We can't use _PIPE_ as the placeholder, because it's a legal name. But we could use _. 
Then

  x |> (_ + 1) + mean(_)

could expand unambiguously to

  (function(_) (_ + 1) + mean(_))(x)

but

  (_ + 1) + mean(_)

shouldn't be taken to be an anonymous function declaration, otherwise things like

  mean(_ |> _)

do become ambiguous: is the second placeholder the argument to the anon function, or is it the placeholder for the embedded pipe?

However, implementing this makes the parser pretty ugly: its handling of _ depends on the outer context. I now agree that leaving out placeholder syntax was the right decision.

- The proposed anonymous-function syntax looks quite ugly to me, diminishing readability and inviting errors. The new pipe symbol |> already looks scarily like quantum mechanics; adding \( just puts fishbones into the symbolic soup.

- IMO it's not worth going too far to try to lure magrittr-etc fans to swap to the new; my experience is that many people keep using older inferior R syntax for years after better replacements become available (even if they are aware of replacements), for various reasons. Just provide a good framework, and let nature take its course.

- Disclaimer: personally I'm not much of a pipehead anyway, so maybe I'm not the audience. But if I was to consider piping, I wouldn't be very tempted by the current proposal. OTOH, I might even be tempted to write--- and use!--- my own version of '%|>%' as above (maybe someone already has). And if R did it for me, that'd be great :)

Yours would suffer one of the same problems as magrittr's: it has the wrong operator precedence. The current precedence ordering (from ?Syntax) is, from highest to lowest:

  :: :::           access variables in a namespace
  $ @              component / slot extraction
  [ [[             indexing
  ^                exponentiation (right to left)
  - +              unary minus and plus
  :                sequence operator
  %any%            special operators (including %% and %/%)
  * /              multiply, divide
  + -              (binary) add, subtract
  < > <= >= == !=  ordering and comparison
  !                negation
  & &&             and
  | ||             or
  ~                as in formulae
  -> ->>           rightwards assignment
  <- <<-           assignment (right to left)
  =                assignment (right to left)
  ?                help (unary and binary)

The %>% operator has higher precedence than the arithmetic operators, so

  x*y %>% f()

is equivalent to x*f(y), not f(x*y) as it should "obviously" be. I believe the new |> operator falls between "| ||" and "~", so

  x || y |> f()

is the same as f(x || y), and

  x ~ y |> f()

is x ~ f(y). There could be arguments about where the new one appears (and there probably have been), but *clearly* magrittr's precedence is wrong, and yours would be too, because they are both fixed at the quite high precedence given to %any%.

Duncan Murdoch

[*] Definition of _ANON_ could be something like this--- almost certainly won't work as-is, this is just to point out that it could be done in standard R.

  `_ANON_` <- function( expr) {
    #1. Construct a function with arg '_PIPE_
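The precedence point about %any% operators can be checked at the parser level, without loading magrittr, because every %any% operator shares one precedence. The following is a minimal sketch using only base R; `x`, `y` and `f` are placeholder names that are parsed but never evaluated.

```r
# Parse (but do not evaluate) an expression mixing `*` with a %any% operator.
e <- quote(x * y %>% f())

# The outermost call is `*`: the %>% call is only its second operand,
# so the pipe receives `y`, not `x * y`.
top_operator  <- as.character(e[[1]])
left_operand  <- e[[2]]
right_operand <- e[[3]]

print(top_operator)
print(right_operand)
```

Wrapping the left-hand side in parentheses, as in `(x * y) %>% f()`, is the usual way to get the grouping a reader would expect.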
Re: [Rd] New pipe operator
On 06/12/2020 9:23 p.m., Gabriel Becker wrote:

Hi Gabor,

On Sun, Dec 6, 2020 at 3:22 PM Gabor Grothendieck wrote:

I understand very well that it is implemented at the syntax level; however, in any case the implementation is irrelevant to the principles.

Here is a similar example to the one I gave before but this time written out:

This works:

  3 |> function(x) x + 1

but this does not:

  foo <- function(x) x + 1
  3 |> foo

so it breaks the principle of functions being first class objects. foo and its definition are not interchangeable.

I understood what you meant as well.

The issue is that neither foo nor its definition are being operated on, or even exist within the scope of what |> is defined to do. You are used to magrittr's %>% where arguably what you are saying would be true. But it's not here, in my view.

Again, I think the issue is that |>, in as much as it "operates" on anything at all (it not being a function, regardless of appearances), operates on call expression objects, NOT on functions, ever.

function(x) x *parses to a call expression* as does RHSfun(), while RHSfun does not, it parses to a name, *regardless of whether that symbol will eventually evaluate to a closure or not.*

So in fact, it seems to me that, technically, all name symbols are being treated exactly the same (none are allowed, including those which will lookup to functions during evaluation), while all* call expressions are also being treated the same. And again, there are no functions anywhere in either case.

I agree it's all about call expressions, but they aren't all being treated equally:

  x |> f(...)

expands to f(x, ...), while

  x |> `function`(...)

expands to `function`(...)(x). This is an exception to the rule for other calls, but I think it's a justified one.

Duncan Murdoch

* except those that include that the parser flags as syntactically special.

You have to write 3 |> foo() but don't have to write 3 |> (function(x) x + 1)().

I think you should probably be careful what you wish for here. I'm not involved with this work and do not speak for any of those who were, but the principled way to make that consistent while remaining entirely in the parser seems very likely to be to require the latter, rather than not require the former.

This isn't just a matter of notation, i.e. foo vs foo(), but is a matter of breaking the way R works as a functional language with first class functions.

I don't agree. Consider `+`. Having

  foo <- get("+")  ## note no `` here
  foo(x,y)

parse and work correctly while

  +(x,y)

does not, does not mean + isn't a function or that it is a "second class citizen"; it simply means that the parser has constraints on the syntax for writing code that calls it that calling other functions are not subject to. The fact that such *syntactic* constraints can exist proves that there is not some overarching inviolable principle being violated here, I think. Now you may say "well that's just the parser, it has to parse + specially because it's an operator with specific precedence etc". Well, the same exact thing is true of |> I think.

Best,
~G

On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker wrote:

Hi Gabor,

On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck < ggrothendi...@gmail.com> wrote:

I think the real issue here is that functions are supposed to be first class objects in R and |> would break that if it is possible to write function(x) x + 1 on the RHS but not foo (assuming foo was defined as that function).

I don't think getting experience with using it can change that inconsistency which seems serious to me and needs to be addressed even if it complicates the implementation since it drives to the heart of what R is.

With respect I think this is a misunderstanding of what is happening here.

Functions are first class citizens. |> is, for all intents and purposes, a macro.

  LHS |> RHS(arg2=5)

parses to

  RHS(LHS, arg2 = 5)

There are no functions at the point in time when the pipe transformation happens, because no code has been evaluated. To know if a symbol is going to evaluate to a function requires evaluation which is a step entirely after the one where the |> pipe is implemented.

Another way to think about it is that

  LHS |> RHS(arg2 = 5)

is another way of writing RHS(LHS, arg2 = 5), NOT R code that is (or even can be) evaluated.

Now this is a subtle point that only really has implications in as much as it is not the case for magrittr pipes, but it's relevant for discussions like this, I think.

~G

On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck wrote:

The construct utils::head is not that common but bare functions are very common and to make it harder to use the common case so that the uncommon case is slightly easier is not desirable.

Also it is trivial to write this which does work:

  mtcars %>% (utils::head)

On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage < hugh.parson...@gmail.com> wrote:

I'm surpris
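The "macro" description and the `+` analogy can both be checked from the parser's output. This is a sketch assuming R >= 4.1 (needed for the native pipe); `LHS` and `RHS` are unevaluated placeholder names taken from the message, not real objects.

```r
# The pipe is rewritten at parse time: the two quoted expressions are the
# same call object, and no `|>` symbol survives in the parse tree.
piped     <- quote(LHS |> RHS(arg2 = 5))
direct    <- quote(RHS(LHS, arg2 = 5))
same_call <- identical(piped, direct)
has_pipe  <- "|>" %in% all.names(piped)

# The `+` analogy: the function is an ordinary first-class object ...
foo <- get("+")
sum_via_name <- foo(1, 2)

# ... but the prefix spelling +(1, 2) is rejected by the parser, which reads
# it as unary plus applied to a malformed parenthesised expression.
prefix_attempt     <- try(parse(text = "+(1, 2)"), silent = TRUE)
prefix_plus_parses <- !inherits(prefix_attempt, "try-error")

print(same_call)
print(has_pipe)
print(prefix_plus_parses)
```

In both cases the restriction lives in the grammar, not in how the function object itself is treated once evaluation starts.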
Re: [Rd] New pipe operator
On Mon, Dec 7, 2020 at 5:41 AM Duncan Murdoch wrote:
> I agree it's all about call expressions, but they aren't all being
> treated equally:
>
> x |> f(...)
>
> expands to f(x, ...), while
>
> x |> `function`(...)
>
> expands to `function`(...)(x). This is an exception to the rule for
> other calls, but I think it's a justified one.

This admitted inconsistency is justified by what? No argument has been
presented. The justification seems to be implicitly driven by implementation
concerns at the expense of usability and language consistency.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [External] Re: New pipe operator
On Sat, Dec 5, 2020 at 1:19 PM wrote:
> Let's get some experience

Here is my last SO post using dplyr rewritten to use R 4.1 devel. Seems
not too bad. Was able to work around the placeholder for gsub by specifying
the arg names and used \(...)... elsewhere. This does not address the
inconsistency discussed though. I have indented by 2 spaces in case the
email wraps around.

The objective is to read myfile.csv including columns that contain
c(...) and integer(0), parsing and evaluating them.

  # taken from:
  # https://stackoverflow.com/questions/65174764/reading-in-a-csv-that-contains-vectors-cx-y-in-r/65175172#65175172

  # create input file for testing
  Lines <- "\"col1\",\"col2\",\"col3\"\n\"a\",1,integer(0)\n\"c\",c(3,4),5\n\"e\",6,7\n"
  cat(Lines, file = "myfile.csv")

  #
  # base R 4.1 (devel)
  DF <- "myfile.csv" |>
    readLines() |>
    gsub(pattern = r'{(c\(.*?\)|integer\(0\))}', replacement = r'{"\1"}') |>
    \(.) read.csv(text = .) |>
    \(.) replace(., 2:3, lapply(.[2:3], \(col) lapply(col, \(x) eval(parse(text = x)))))

  #
  # dplyr/magrittr
  library(dplyr)
  DF <- "myfile.csv" %>%
    readLines %>%
    gsub(r'{(c\(.*?\)|integer\(0\))}', r'{"\1"}', .) %>%
    { read.csv(text = .) } %>%
    mutate(across(2:3, ~ lapply(., function(x) eval(parse(text = x)))))
Re: [Rd] [External] Re: New pipe operator
Or, keeping dplyr but with R-devel pipe and function shorthand:

  DF <- "myfile.csv" %>%
    readLines() |>
    \(.) gsub(r'{(c\(.*?\)|integer\(0\))}', r'{"\1"}', .) |>
    \(.) read.csv(text = .) |>
    mutate(across(2:3, \(col) lapply(col, \(x) eval(parse(text = x)))))

Using named arguments to redirect to the implicit first does work,
also in magrittr, but for me at least it is the kind of thing I would
probably regret a month later when trying to figure out the code.

Best,

luke

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
Re: [Rd] New pipe operator
On Mon, Dec 7, 2020 at 6:53 PM Gabor Grothendieck wrote:
>
> On Mon, Dec 7, 2020 at 5:41 AM Duncan Murdoch wrote:
> > I agree it's all about call expressions, but they aren't all being
> > treated equally:
> >
> > x |> f(...)
> >
> > expands to f(x, ...), while
> >
> > x |> `function`(...)
> >
> > expands to `function`(...)(x). This is an exception to the rule for
> > other calls, but I think it's a justified one.
>
> This admitted inconsistency is justified by what? No argument has been
> presented. The justification seems to be implicitly driven by implementation
> concerns at the expense of usability and language consistency.

Sorry if I have missed something, but is your consistency argument
basically that if

foo <- function(x) x + 1

then

x |> foo
x |> function(x) x + 1

should both work the same? Suppose it did. Would you then be OK if

x |> foo()

no longer worked as it does now, and produced foo()(x) instead of foo(x)?

If you are not OK with that and want to retain the current behaviour,
what would you want to happen with the following?

bar <- function(x) function(n) rnorm(n, mean = x)

10 |> bar(runif(1))() # works 'as expected' ~ bar(runif(1))(10)
10 |> bar(runif(1)) # currently bar(10, runif(1))

both of which you probably want. But then

baz <- bar(runif(1))
10 |> baz

(not currently allowed) will not be the same as what you would want from

10 |> bar(runif(1))

which leads to a different kind of inconsistency, doesn't it?

-Deepayan
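The two expansions in the `bar` example can be confirmed by quoting the piped expressions rather than running them. This is a sketch assuming R >= 4.1; `bar` is defined exactly as in the message, and the fixed mean of 0 in the last step is an arbitrary choice made for illustration.

```r
bar <- function(x) function(n) rnorm(n, mean = x)

# With the trailing (), the pipe inserts 10 into the *outer* call.
with_call   <- quote(10 |> bar(runif(1))())
outer_match <- identical(with_call, quote(bar(runif(1))(10)))

# Without it, 10 becomes the first argument of bar() itself.
without_call <- quote(10 |> bar(runif(1)))
inner_match  <- identical(without_call, quote(bar(10, runif(1))))

# Evaluating the first form draws 10 values; evaluating the second fails,
# because bar() accepts a single argument.
draws       <- 10 |> bar(0)()
second_form <- try(eval(without_call), silent = TRUE)

print(outer_match)
print(inner_match)
print(length(draws))
```

So the same text `bar(runif(1))` means "a call to fill in" in one position and "a function to call" in the other, which is the ambiguity the message is pointing at.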
Re: [Rd] New pipe operator
One could examine how magrittr works as a reference implementation if
there is a question on how something should function. It's in
widespread use and seems to work well.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [Rd] New pipe operator
On Mon, Dec 7, 2020 at 9:23 PM Gabor Grothendieck wrote:
>
> One could examine how magrittr works as a reference implementation if
> there is a question on how something should function. It's in
> widespread use and seems to work well.

Yes, but it has many inconsistencies (including for the example I
gave). Do you want a magrittr clone, or do you want consistency? It's
OK to want either, but I don't think you can get both.

What we actually end up with is another matter, depending on many
other factors. I was just trying to understand your consistency
argument.

-Deepayan
Re: [Rd] New pipe operator
Hmm,

I feel a bit bad coming late to this, but I think I am beginning to side with those who want "... |> head" to work. And yes, that has to happen at the expense of |> head().

As I think it was Gabor points out, the current structure goes down a nonstandard evaluation route, which may be difficult to explain and departs from usual operator evaluation paradigms by being an odd mix of syntax and semantics. R lets you do these sorts of thing, witness ggplot and tidyverse, but the transparency of the language tends to suffer.

It would be neater if it was simply so that the class/type of the object on the right hand side decided what should happen. So we could have a rule that we could have an object, an expression, and possibly an unevaluated call on the RHS. Or maybe a formula, I.e., we could have

  ... |> head

but not

  ... |> head()

because head() does not evaluate to anything useful. Instead, we could have some of these

  ... |> quote(head())
  ... |> expression(head())
  ... |> ~ head()
  ... |> \(_) head(_)

possibly also using a placeholder mechanism for the three first ones. I kind of like the idea that the ~ could be equivalent to \(_).

(And yes, I am kicking myself a bit for not using ~ in the NSE arguments in subset() and transform())

-pd

> On 7 Dec 2020, at 16:20 , Deepayan Sarkar wrote:
>
> On Mon, Dec 7, 2020 at 6:53 PM Gabor Grothendieck wrote:
>>
>> On Mon, Dec 7, 2020 at 5:41 AM Duncan Murdoch wrote:
>>> I agree it's all about call expressions, but they aren't all being
>>> treated equally:
>>>
>>> x |> f(...)
>>>
>>> expands to f(x, ...), while
>>>
>>> x |> `function`(...)
>>>
>>> expands to `function`(...)(x). This is an exception to the rule for
>>> other calls, but I think it's a justified one.
>>
>> This admitted inconsistency is justified by what? No argument has been
>> presented. The justification seems to be implicitly driven by implementation
>> concerns at the expense of usability and language consistency.
> > Sorry if I have missed something, but is your consistency argument > basically that if > > foo <- function(x) x + 1 > > then > > x |> foo > x |> function(x) x + 1 > > should both work the same? Suppose it did. Would you then be OK if > > x |> foo() > > no longer worked as it does now, and produced foo()(x) instead of foo(x)? > > If you are not OK with that and want to retain the current behaviour, > what would you want to happen with the following? > > bar <- function(x) function(n) rnorm(n, mean = x) > > 10 |> bar(runif(1))() # works 'as expected' ~ bar(runif(1))(10) > 10 |> bar(runif(1)) # currently bar(10, runif(1)) > > both of which you probably want. But then > > baz <- bar(runif(1)) > 10 |> baz > > (not currently allowed) will not be the same as what you would want from > > 10 |> bar(runif(1)) > > which leads to a different kind of inconsistency, doesn't it? > > -Deepayan > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [External] Re: New pipe operator
On Mon, Dec 7, 2020 at 10:11 AM wrote:
> Or, keeping dplyr but with R-devel pipe and function shorthand:
>
> DF <- "myfile.csv" %>%
> readLines() |>
> \(.) gsub(r'{(c\(.*?\)|integer\(0\))}', r'{"\1"}', .) |>
> \(.) read.csv(text = .) |>
> mutate(across(2:3, \(col) lapply(col, \(x) eval(parse(text = x)))))
>
> Using named arguments to redirect to the implicit first does work,
> also in magrittr, but for me at least it is the kind of thing I would
> probably regret a month later when trying to figure out the code.

The gsub issue suggests that if one were to start afresh,
the arguments to gsub (and many other R functions)
should be rearranged. Of course, that is precisely what
the tidyverse did.
Re: [Rd] New pipe operator
On 07/12/2020 11:18 a.m., peter dalgaard wrote:
> Hmm,
>
> I feel a bit bad coming late to this, but I think I am beginning to side
> with those who want "... |> head" to work. And yes, that has to happen at
> the expense of |> head().

Just curious, how would you express head(df, 10)? Currently it is

  df |> head(10)

Would I have to write it as

  df |> function(d) head(d, 10)

> As I think it was Gabor points out, the current structure goes down a
> nonstandard evaluation route, which may be difficult to explain and departs
> from usual operator evaluation paradigms by being an odd mix of syntax and
> semantics. R lets you do these sorts of thing, witness ggplot and tidyverse,
> but the transparency of the language tends to suffer.

I wouldn't call it non-standard evaluation. There is no function corresponding to |>, so there's no evaluation at all. It is more like the way "x -> y" is parsed as "y <- x", or "if (x) y" is transformed to `if`(x, y).

Duncan Murdoch

> It would be neater if it was simply so that the class/type of the object on
> the right hand side decided what should happen. So we could have a rule that
> we could have an object, an expression, and possibly an unevaluated call on
> the RHS. Or maybe a formula, I.e., we could have
>
> ... |> head
>
> but not
>
> ... |> head()
>
> because head() does not evaluate to anything useful. Instead, we could have
> some of these
>
> ... |> quote(head())
> ... |> expression(head())
> ... |> ~ head()
> ... |> \(_) head(_)
>
> possibly also using a placeholder mechanism for the three first ones. I kind
> of like the idea that the ~ could be equivalent to \(_).
>
> (And yes, I am kicking myself a bit for not using ~ in the NSE arguments in
> subset() and transform())
>
> -pd
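The comparison with "x -> y" and "if (x) y" can be made concrete by inspecting what the parser returns for each. This is a sketch using base R only; `x` and `y` are unevaluated placeholder names.

```r
# Rightward assignment is rewritten by the parser into leftward assignment.
arrow_rewritten <- identical(quote(x -> y), quote(y <- x))

# An if statement is represented as an ordinary call whose function is `if`.
if_expr    <- quote(if (x) y)
if_is_call <- is.call(if_expr) && identical(if_expr[[1]], as.name("if"))

# Unlike `if` or `<-`, there is no function object named `|>` to look up:
# the pipe exists only as syntax.
pipe_function_exists <- exists("|>", mode = "function")

print(arrow_rewritten)
print(if_is_call)
print(pipe_function_exists)
```

One difference worth noting is that `if` still has a function behind the call, whereas the pipe leaves nothing behind after parsing.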
[Rd] anonymous functions
“The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may be helpful in making code containing simple function expressions more readable.”

Color me unimpressed.

Over the decades I've seen several "who can write the shortest code" threads: in Fortran, in C, in Splus, ... The same old idea that "short" is a synonym for either elegant, readable, or efficient is now being recycled in the tidyverse. The truth is that "short" is actually an antonym for all of these things, at least for anyone else reading the code; or for the original coder 30-60 minutes after the "clever" lines were written. Minimal use of the spacebar and/or the return key isn't usually held up as a goal, but creeps into many practitioners' code as well.

People are excited by replacing "function(" with "\("? Really? Are people typing code with their thumbs?

I am ambivalent about pipes: I think it is a great concept, but too many of my colleagues think that using pipes = no need for any comments.

As time goes on, I find my goal is to make my code less compact and more readable. Every bug fix or new feature in the survival package now adds more lines of comments or other documentation than lines of code. If I have to puzzle out what a line does, what about the poor sod who inherits the maintenance?

--
Terry M Therneau, PhD
Department of Health Science Research
Mayo Clinic
thern...@mayo.edu

"TERR-ree THUR-noh"
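The quoted NEWS sentence says the shorthand is purely a parsing convenience, which can be checked directly. This is a sketch assuming R >= 4.1; the names `short` and `long` are made up for illustration.

```r
# The two spellings produce functions with the same formals and body.
short <- \(x) x + 1
long  <- function(x) x + 1

same_formals <- identical(formals(short), formals(long))
same_body    <- identical(body(short), body(long))

# The quoted shorthand is already a call to `function`; no backslash
# survives in the parsed expression.
parsed_head <- quote(\(x) x + 1)[[1]]

print(same_formals)
print(same_body)
print(parsed_head)
```

Whether the shorter spelling helps or hurts readability is the matter of taste the message is arguing about; the code above only shows that nothing else changes.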
Re: [Rd] [External] Re: New pipe operator
My vote is for the consistency of function calls always having parentheses, including in pipes. Making them optional only saves two keystrokes, but will add yet another inconsistency to confuse or trip folks up.

As for the new anonymous function syntax, I would prefer something more human friendly, perhaps provide “fun” as a shortcut for “function”, enabling:

  DF <- "myfile.csv" %>%
    readLines() |>
    fun(x) gsub(r'{(c\(.*?\)|integer\(0\))}', r'{"\1"}', x) |>
    fun(x) read.csv(text = x) |>
    mutate(
      across(2:3,
        fun(col) lapply(col,
          fun(x) eval(parse(text = x))
        )
      )
    )

which seems much easier to read and understand, at the cost of only a few extra characters.

-G

--
"Whereas true religion and good morals are the only solid foundations of public liberty and happiness . . . it is hereby earnestly recommended to the several States to take the most effectual measures for the encouragement thereof." Continental Congress, 1778
Re: [Rd] New pipe operator
> On 7 Dec 2020, at 17:35 , Duncan Murdoch wrote:
>
> On 07/12/2020 11:18 a.m., peter dalgaard wrote:
>> Hmm,
>> I feel a bit bad coming late to this, but I think I am beginning to side
>> with those who want "... |> head" to work. And yes, that has to happen at
>> the expense of |> head().
>
> Just curious, how would you express head(df, 10)? Currently it is
>
> df |> head(10)
>
> Would I have to write it as
>
> df |> function(d) head(d, 10)

It could be

  df |> ~ head(_, 10)

which in a sense is "yes" to your question.

>
>> As I think it was Gabor points out, the current structure goes down a
>> nonstandard evaluation route, which may be difficult to explain and departs
>> from usual operator evaluation paradigms by being an odd mix of syntax and
>> semantics. R lets you do these sorts of thing, witness ggplot and tidyverse,
>> but the transparency of the language tends to suffer.
>
> I wouldn't call it non-standard evaluation. There is no function
> corresponding to |>, so there's no evaluation at all. It is more like the
> way "x -> y" is parsed as "y <- x", or "if (x) y" is transformed to `if`(x,
> y).

That's a point, but maybe also my point. Currently, the parser is inserting the LHS as the 1st argument of the RHS, right? Things might be simpler if it was more like a simple binop.

-pd

> Duncan Murdoch
>
>> It would be neater if it was simply so that the class/type of the object on
>> the right hand side decided what should happen. So we could have a rule that
>> we could have an object, an expression, and possibly an unevaluated call on
>> the RHS. Or maybe a formula, I.e., we could have
>> ... |> head
>> but not
>> ... |> head()
>> because head() does not evaluate to anything useful. Instead, we could have
>> some of these
>> ... |> quote(head())
>> ... |> expression(head())
>> ... |> ~ head()
>> ... |> \(_) head(_)
>> possibly also using a placeholder mechanism for the three first ones. I kind
>> of like the idea that the ~ could be equivalent to \(_).
>> (And yes, I am kicking myself a bit for not using ~ in the NSE arguments in >> subset() and transform()) >> -pd >>> On 7 Dec 2020, at 16:20 , Deepayan Sarkar wrote: >>> >>> On Mon, Dec 7, 2020 at 6:53 PM Gabor Grothendieck >>> wrote: On Mon, Dec 7, 2020 at 5:41 AM Duncan Murdoch wrote: > I agree it's all about call expressions, but they aren't all being > treated equally: > > x |> f(...) > > expands to f(x, ...), while > > x |> `function`(...) > > expands to `function`(...)(x). This is an exception to the rule for > other calls, but I think it's a justified one. This admitted inconsistency is justified by what? No argument has been presented. The justification seems to be implicitly driven by implementation concerns at the expense of usability and language consistency. >>> >>> Sorry if I have missed something, but is your consistency argument >>> basically that if >>> >>> foo <- function(x) x + 1 >>> >>> then >>> >>> x |> foo >>> x |> function(x) x + 1 >>> >>> should both work the same? Suppose it did. Would you then be OK if >>> >>> x |> foo() >>> >>> no longer worked as it does now, and produced foo()(x) instead of foo(x)? >>> >>> If you are not OK with that and want to retain the current behaviour, >>> what would you want to happen with the following? >>> >>> bar <- function(x) function(n) rnorm(n, mean = x) >>> >>> 10 |> bar(runif(1))() # works 'as expected' ~ bar(runif(1))(10) >>> 10 |> bar(runif(1)) # currently bar(10, runif(1)) >>> >>> both of which you probably want. But then >>> >>> baz <- bar(runif(1)) >>> 10 |> baz >>> >>> (not currently allowed) will not be the same as what you would want from >>> >>> 10 |> bar(runif(1)) >>> >>> which leads to a different kind of inconsistency, doesn't it? 
>>> >>> -Deepayan >>> >>> __ >>> R-devel@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel > -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
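The two expansions Deepayan describes in the quoted text can be checked at the parser level without running anything. This is a minimal sketch, assuming R 4.1.0 or later (where `|>` is available); `bar` is the function from his example and is never called here, only parsed.

```r
# quote() returns the parsed expression, so the rewrite done by the pipe
# is visible without evaluating bar(), runif() or rnorm().
e1 <- quote(10 |> bar(runif(1))())
e2 <- quote(10 |> bar(runif(1)))

# In both cases the LHS becomes the first argument of the outermost call
# on the RHS; only the identity of that outermost call differs.
stopifnot(identical(e1, quote(bar(runif(1))(10))))
stopifnot(identical(e2, quote(bar(10, runif(1)))))
```

The second form is the one that would change meaning if `x |> f` were allowed and `x |> f()` were reinterpreted as `f()(x)`.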
Re: [Rd] anonymous functions
Thanks for expressing this eloquently. I heartily agree. On Mon, Dec 7, 2020 at 12:04 PM Therneau, Terry M., Ph.D. via R-devel < r-devel@r-project.org> wrote: > “The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may be > helpful in making > code containing simple function expressions more readable.” > > Color me unimpressed. > Over the decades I've seen several "who can write the shortest code" > threads: in Fortran, > in C, in Splus, ... The same old idea that "short" is a synonym for > either elegant, > readable, or efficient is now being recylced in the tidyverse. The truth > is that "short" > is actually an antonym for all of these things, at least for anyone else > reading the code; > or for the original coder 30-60 minutes after the "clever" lines were > written. Minimal > use of the spacebar and/or the return key isn't usually held up as a goal, > but creeps into > many practiioner's code as well. > > People are excited by replacing "function(" with "\("? Really? Are > people typing code > with their thumbs? > I am ambivalent about pipes: I think it is a great concept, but too many > of my colleagues > think that using pipes = no need for any comments. > > As time goes on, I find my goal is to make my code less compact and more > readable. Every > bug fix or new feature in the survival package now adds more lines of > comments or other > documentation than lines of code. If I have to puzzle out what a line > does, what about > the poor sod who inherits the maintainance? > > > -- > Terry M Therneau, PhD > Department of Health Science Research > Mayo Clinic > thern...@mayo.edu > > "TERR-ree THUR-noh" > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > -- "Whereas true religion and good morals are the only solid foundations of public liberty and happiness . . . it is hereby earnestly recommended to the several States to take the most effectual measures for the encouragement thereof." 
Continental Congress, 1778
Re: [Rd] [External] anonymous functions
I don't disagree in principle, but the reality is users want shortcuts and as a result various packages, in particular tidyverse, have been providing them. Mostly based on formulas, mostly with significant issues since formulas weren't designed for this, and mostly incompatible (tidyverse ones are compatible within tidyverse but not with others). And of course none work in sapply or lapply. Providing a shorthand in base may help to improve this. You don't have to use it if you don't want to, and you can establish coding standards that disallow it if you like. Best, luke On Mon, 7 Dec 2020, Therneau, Terry M., Ph.D. via R-devel wrote: “The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may be helpful in making code containing simple function expressions more readable.” Color me unimpressed. Over the decades I've seen several "who can write the shortest code" threads: in Fortran, in C, in Splus, ... The same old idea that "short" is a synonym for either elegant, readable, or efficient is now being recylced in the tidyverse. The truth is that "short" is actually an antonym for all of these things, at least for anyone else reading the code; or for the original coder 30-60 minutes after the "clever" lines were written. Minimal use of the spacebar and/or the return key isn't usually held up as a goal, but creeps into many practiioner's code as well. People are excited by replacing "function(" with "\("? Really? Are people typing code with their thumbs? I am ambivalent about pipes: I think it is a great concept, but too many of my colleagues think that using pipes = no need for any comments. As time goes on, I find my goal is to make my code less compact and more readable. Every bug fix or new feature in the survival package now adds more lines of comments or other documentation than lines of code. If I have to puzzle out what a line does, what about the poor sod who inherits the maintainance? -- Luke Tierney Ralph E. 
Wareham Professor of Mathematical Sciences
University of Iowa
Department of Statistics and Actuarial Science
241 Schaeffer Hall, Iowa City, IA 52242
Phone: 319-335-3386
Fax: 319-335-3017
email: luke-tier...@uiowa.edu
WWW: http://www.stat.uiowa.edu
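To make the remark about sapply and lapply concrete: the base shorthand produces an ordinary closure, so it can be used anywhere a function is expected. A small sketch, assuming R 4.1.0 or later; the variable names are illustrative only.

```r
# \(x) is parsed as function(x), so both calls receive the same kind of object.
squares_long  <- sapply(1:3, function(x) x^2)
squares_short <- sapply(1:3, \(x) x^2)
stopifnot(identical(squares_long, squares_short))

# No special support is needed in lapply() either.
halves <- lapply(c(a = 2, b = 4), \(v) v / 2)
stopifnot(identical(halves, list(a = 1, b = 2)))
```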
Re: [Rd] [External] Re: New pipe operator
On 07/12/2020 12:03 p.m., Gregory Warnes wrote:
> My vote is for the consistency of function calls always having
> parentheses, including in pipes. Making them optional only saves two
> keystrokes, but will add yet another inconsistency to confuse or trip
> folks up.
>
> As for the new anonymous function syntax, I would prefer something more
> human friendly, perhaps provide “fun” as a shortcut for “function”,
> enabling:
>
>   DF <- "myfile.csv" %>%
>     readLines() |>
>     fun(x) gsub(r'{(c\(.*?\)|integer\(0\))}', r'{"\1"}', x) |>
>     fun(x) read.csv(text = x) |>
>     mutate( across(2:3, fun(col) lapply(col, fun(x) eval(parse(text = x)) ) ) )
>
> which seems much easier to read and understand, at the cost of only a
> few extra characters.

But you didn't "always" include parentheses, you skipped them on the calls to the anonymous functions. I think that's the one place I'd make the exception, so maybe we agree: parens are almost always needed, with the sole exception being anonymous functions.

As to using "fun", I think that's a bad idea. I haven't checked, but I wouldn't be too surprised if "fun" has been used thousands of times in CRAN packages as the name of a function. So

  x |> fun(y)

would mean "fun(x, y)", whereas

  x |> fun(y) y+1

would mean (function(y) y+1)(x).

Duncan Murdoch
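The name clash can be shown directly. In this sketch `fun` is a hypothetical user-defined function, standing in for the many existing functions that might carry that name; it assumes R 4.1.0 or later, where a function expression on the RHS has to be parenthesised and called explicitly.

```r
# A perfectly legal existing function that happens to be called `fun`.
fun <- function(x, y) x + y

# With the pipe as implemented this is simply fun(1, 2).
piped <- 1 |> fun(2)

# The anonymous-function reading needs a real function expression instead.
lambda <- 1 |> (\(y) y + 1)()

stopifnot(identical(piped, 3), identical(lambda, 2))
```

If `fun` were also a keyword for function definitions, the first line of the pipeline above could no longer be read without looking at what follows the closing parenthesis.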
Re: [Rd] New pipe operator
On 07/12/2020 12:09 p.m., Peter Dalgaard wrote:
>> On 7 Dec 2020, at 17:35 , Duncan Murdoch wrote:
>>
>> On 07/12/2020 11:18 a.m., peter dalgaard wrote:
>>> Hmm,
>>> I feel a bit bad coming late to this, but I think I am beginning to side
>>> with those who want "... |> head" to work. And yes, that has to happen at
>>> the expense of |> head().
>>
>> Just curious, how would you express head(df, 10)? Currently it is
>>
>>   df |> head(10)
>>
>> Would I have to write it as
>>
>>   df |> function(d) head(d, 10)
>
> It could be
>
>   df |> ~ head(_, 10)
>
> which in a sense is "yes" to your question.

I think that's doing too much weird stuff. I wouldn't want to have to teach it to beginners, whereas I think I could teach "df |> head(10)". That's doing one weird thing, but I'd count about three things I'd consider weird in yours.

>>> As I think it was Gabor points out, the current structure goes down a
>>> nonstandard evaluation route, which may be difficult to explain and departs
>>> from usual operator evaluation paradigms by being an odd mix of syntax and
>>> semantics. R lets you do these sorts of thing, witness ggplot and tidyverse,
>>> but the transparency of the language tends to suffer.
>>
>> I wouldn't call it non-standard evaluation. There is no function
>> corresponding to |>, so there's no evaluation at all. It is more like the
>> way "x -> y" is parsed as "y <- x", or "if (x) y" is transformed to `if`(x,
>> y).
>
> That's a point, but maybe also my point. Currently, the parser is inserting
> the LHS as the 1st argument of the RHS, right? Things might be simpler if it
> was more like a simple binop.

An advantage of the current implementation is that it's simple and easy to understand. Once you make it a user-modifiable binary operator, things will go kind of nuts.

For example, I doubt if there are many users of magrittr's pipe who really understand its subtleties, e.g. the example in Luke's paper where 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1, 1, 2). (And I could add 1 %>% c(c(.), 2, .) and 1 %>% c(c(.), 2, . + 2) to continue the fun.)

Duncan Murdoch
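The comparison with "x -> y" and "if (x) y" made in this message can be inspected with quote(), which shows what the parser actually stores. A minimal sketch, assuming R 4.1.0 or later for the pipe; `x`, `y` and `df` are just symbols here and are never evaluated.

```r
# `x -> y` is stored as a call to `<-` with the operands swapped.
stopifnot(identical(quote(x -> y), quote(y <- x)))

# `if (x) y` is stored as an ordinary call to the function named `if`.
e_if <- quote(if (x) y)
stopifnot(identical(e_if[[1]], as.name("if")))

# Likewise `df |> head(10)` is stored as head(df, 10); no `|>` survives
# into the parsed expression, so there is nothing left to evaluate.
e_pipe <- quote(df |> head(10))
stopifnot(identical(e_pipe, quote(head(df, 10))))
```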
Re: [Rd] [External] Re: New pipe operator
On Mon, 7 Dec 2020, Peter Dalgaard wrote:
>> On 7 Dec 2020, at 17:35 , Duncan Murdoch wrote:
>>
>> On 07/12/2020 11:18 a.m., peter dalgaard wrote:
>>> Hmm,
>>> I feel a bit bad coming late to this, but I think I am beginning to side
>>> with those who want "... |> head" to work. And yes, that has to happen at
>>> the expense of |> head().
>>
>> Just curious, how would you express head(df, 10)? Currently it is
>>
>>   df |> head(10)
>>
>> Would I have to write it as
>>
>>   df |> function(d) head(d, 10)
>
> It could be
>
>   df |> ~ head(_, 10)
>
> which in a sense is "yes" to your question.
>
>>> As I think it was Gabor points out, the current structure goes down a
>>> nonstandard evaluation route, which may be difficult to explain and departs
>>> from usual operator evaluation paradigms by being an odd mix of syntax and
>>> semantics. R lets you do these sorts of thing, witness ggplot and tidyverse,
>>> but the transparency of the language tends to suffer.
>>
>> I wouldn't call it non-standard evaluation. There is no function
>> corresponding to |>, so there's no evaluation at all. It is more like the
>> way "x -> y" is parsed as "y <- x", or "if (x) y" is transformed to `if`(x,
>> y).
>
> That's a point, but maybe also my point. Currently, the parser is inserting
> the LHS as the 1st argument of the RHS, right? Things might be simpler if it
> was more like a simple binop.

It can only be a simple binop if you only allow RHS functions of one argument. Which would require currying along the lines Duncan showed. Something like:

  `%>>%` <- function(x, f) f(x)
  C1 <- function(f, ...) function(x) f(x, ...)

  mtcars %>>% head
  mtcars %>>% C1(head, 2)
  mtcars %>>% C1(subset, cyl == 4) %>>% \(d) lm(mpg ~ disp, data = d)

This might fly if we lived in a world where most RHS functions take one argument and only a few needed currying. That is the case in many functional languages, but not for R. Making the common case of multiple arguments easy means you have to work at the source level, either in the parser or with some form of NSE.

Best,

luke
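The `%>>%` and C1 definitions from this message run as written. The sketch below repeats them and splits the last pipeline into steps so that each intermediate result can be inspected; the names `first_two`, `four_cyl` and `fit` are illustrative, and R 4.1.0 or later is assumed for the `\(d)` shorthand.

```r
# A pipe that is a plain binary operator: the RHS must be a one-argument function.
`%>>%` <- function(x, f) f(x)

# Currying helper: fix the trailing arguments, leave the first one open.
C1 <- function(f, ...) function(x) f(x, ...)

first_two <- mtcars %>>% C1(head, 2)            # same as head(mtcars, 2)
four_cyl  <- mtcars %>>% C1(subset, cyl == 4)   # same as subset(mtcars, cyl == 4)
fit       <- four_cyl %>>% \(d) lm(mpg ~ disp, data = d)

stopifnot(nrow(first_two) == 2, all(four_cyl$cyl == 4), inherits(fit, "lm"))
```

Every multi-argument call needs the C1() wrapper, which is the cost being pointed out: in R the multi-argument case is the common one.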
Re: [Rd] anonymous functions
It is easier to understand a function if you can see the entire function body at once on a page or screen and excessive verbosity interferes with that. On Mon, Dec 7, 2020 at 12:04 PM Therneau, Terry M., Ph.D. via R-devel wrote: > > “The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may be > helpful in making > code containing simple function expressions more readable.” > > Color me unimpressed. > Over the decades I've seen several "who can write the shortest code" threads: > in Fortran, > in C, in Splus, ... The same old idea that "short" is a synonym for either > elegant, > readable, or efficient is now being recylced in the tidyverse. The truth is > that "short" > is actually an antonym for all of these things, at least for anyone else > reading the code; > or for the original coder 30-60 minutes after the "clever" lines were > written. Minimal > use of the spacebar and/or the return key isn't usually held up as a goal, > but creeps into > many practiioner's code as well. > > People are excited by replacing "function(" with "\("? Really? Are people > typing code > with their thumbs? > I am ambivalent about pipes: I think it is a great concept, but too many of > my colleagues > think that using pipes = no need for any comments. > > As time goes on, I find my goal is to make my code less compact and more > readable. Every > bug fix or new feature in the survival package now adds more lines of > comments or other > documentation than lines of code. If I have to puzzle out what a line does, > what about > the poor sod who inherits the maintainance? > > > -- > Terry M Therneau, PhD > Department of Health Science Research > Mayo Clinic > thern...@mayo.edu > > "TERR-ree THUR-noh" > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel -- Statistics & Software Consulting GKX Group, GKX Associates Inc. 
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [Rd] [R/S-PLUS] [EXTERNAL] Re: [External] anonymous functions
Luke,

Mostly an aside. I think that pipes are a good addition, and it is clear that you and other R-core thought through many of the details. Congratulations on what appears to be solid work. I've used Unix since '79, so it is almost guaranteed that I like the basic idiom, and I expect to make use of it. That some users think pipes -- or any other code -- are so clear that comments are superfluous is no reflection on R core, and is also a bit of a hobby horse for me.

I am a bit bemused by the flood of change suggestions, before people have had a chance to fully exercise the new code. I'd suggest waiting several months, or a year, before major updates, straight up bugs excepted. The same advice holds when moving into a new house. One experience with the survival package has been that most new ideas have been implemented locally, and we run with them for half a year before submission to CRAN. I've had a few "really great" modifications that, thankfully, were never inflicted on the rest of the R community.

Terry T.
Re: [Rd] New pipe operator
On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch wrote:
> An advantage of the current implementation is that it's simple and easy
> to understand. Once you make it a user-modifiable binary operator,
> things will go kind of nuts.
>
> For example, I doubt if there are many users of magrittr's pipe who
> really understand its subtleties, e.g. the example in Luke's paper where
> 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1, 1, 2). (And
> I could add 1 %>% c(c(.), 2, .) and 1 %>% c(c(.), 2, . + 2) to
> continue the fun.)

The rule is not so complicated. Automatic insertion is done unless you use the dot as a direct argument of the top-level function call or you surround the call with { ... }. It really makes sense: if you use gsub(pattern, replacement, .) then surely you don't want automatic insertion, and if you surround it with { ... } then you are explicitly telling it not to.

Assuming the existence of placeholders, a possible simplification would be to NOT do automatic insertion if { ... } is used and to do it otherwise, although personally, having used it for some time, I find the existing rule in magrittr generally does what you want.
Re: [Rd] New pipe operator
IMHO the use of anonymous functions is a very clean solution to the placeholder problem, and the shorthand lambda syntax makes it much more ergonomic to use. Pipe implementations that crawl the RHS for usages of `.` are going to be more expensive than the alternatives. It is nice that the `|>` operator is effectively the same as a regular R function call, and given the identical semantics could then also be reasoned about the same way regular R function calls are.

I also agree usages of the `.` placeholder can make the code more challenging to read, since understanding the behavior of a piped expression then requires scouring the RHS for usages of `.`, which can be challenging in dense code. Piping to an anonymous function makes the intent clear to the reader: the programmer is likely piping to an anonymous function because they care where the argument is used in the call, and so the reader of code should be aware of that.

Best,
Kevin
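As an illustration of piping to an anonymous function when the argument position matters, here is a short sketch using gsub(), where the piped value belongs in the third argument. It assumes released R 4.1.0 or later, in which the function expression must be wrapped in parentheses and called explicitly; the bare form discussed in this thread belongs to the development version of the time.

```r
# The lambda names the piped value `s` and places it as gsub()'s third argument.
res <- "xax" |> (\(s) gsub("x", "y", s))()
stopifnot(identical(res, "yay"))

# Without the lambda the value lands in the first argument (the pattern).
e_default <- quote("xax" |> gsub("x", "y"))
stopifnot(identical(e_default, quote(gsub("xax", "x", "y"))))
```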
[Rd] sequential chained operator thoughts
It has been very enlightening watching the discussion not only about the existing and proposed variations of a data "pipe" operator in R but also cognates in many other languages.

So I am throwing out a QUESTION that just asks if the pipeline as done is pretty much what could also be done without the need for an operator, using a sort of one-time bracketed construct where you call a function with a sequence of operations you want performed and just have it handle the in-between parts. I mean something like:

  return_val <- do_chain_sequence( {
      initial_data,
      function1(_VAL_);
      function2(_VAL_, more_args);
      function3(args, 2 * _VAL_, more_args);
      ...
      function_n(_VAL_)
  })

The above is not meant to be taken literally. I don't care if the symbol is _VAL_ or you use semi-colon characters between statements. There are many possible variants such as each step being in its own curly braces. The idea is to hand over one or more unevaluated blocks of code. There are such functions in use in R already. And yes, it can be written with explicit BEFORE/AFTER clauses to handle things but those are implementation details and I want to focus on a concept.

The point is you can potentially write a function that, given such a series of arguments, delays evaluation of them until each is needed or used. About all it might need to do is set the value of something like _VAL_ from the first argument if present and then take the text of each subsequent argument and run it while saving the result back into _VAL_ and at the end, return the last _VAL_. Along the way, of course, the temporary values stored each time in _VAL_ would disappear.

Is something like this any improvement over this done by the user:

  Initial <- whatever
  Temp1 <- function1(Initial)
  Temp2 <- function2(Temp1, ...)
  rm(Temp1)
  ...

Well, maybe not much.
But it does hide some details and allows you to insert or delete steps without worrying about pesky details like variable names being in the right sequence or not over-riding other things in your namespace. It makes your intent clear.

Now obviously being evaluated inside a function is not necessarily the same as remaining in the original environment, so having something like this as a built-in running in place might be a better idea.

I admit the details of how to get one piece at a time as some unevaluated form and recognize clearly what each piece is takes some careful thought. If you want to automatically throw in a first argument of _VAL_ after the first parenthesis found or inserted in new parens if just the name of a function was presented, or other such manipulations as already seem to happen with the magrittr pipe where a period is the placeholder, that can be delicate work and also fail for some lines of code. There may be many reasons various versions of this proposal can fail for some cases.

But functionally, it would be a way to specify in a linear fashion that a sequence of steps is to be connected with data being passed along as it changes. I can also imagine how this kind of method might allow twists like asking for _VAL_$second or other changes such as sorted(_VAL_) or minmax(_VAL_) that would shrink the sequence.

This general idea looks like something that some programming language may already do in some form, and functionally it is a bit like the pipe idea, albeit with different overhead. And do note many languages already support this in subtle ways. R has a variable called ".Last.value" that holds the result of the last top-level expression evaluated. If the template above is used properly, that alone might work, albeit be a bit wordy. But it may be more transient in some cases such as a multi-part statement where it ends up being reset within the statement.

I am NOT asking for a new feature in R, or any language.
I am just asking if the various pipeline ideas used could be done in a general way like I describe as a sequence where the statements are chained as described and intermediate results are transient. But, yes, some implementations might require some changes to the language to be implemented properly and it might not satisfy people used to thinking a certain way. I end by saying that R is a language that only returns one (sometimes complex) return value. Other languages allow multiple return values and pipelines there might be hard to implement or have weird features that allow various of the returns to be captured or even a more general graph of command sequences rather than just a linear pipeline. My thoughts here are for R alone. And I shudder at what happens if you allow exceptions and other kinds of breaks/returns out of such a sequential grouping in mid-stride. I view most such additions and changes as needing careful thought to make sure they have the functionality most people want, are as
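The chained-sequence idea described in this message can be sketched as an ordinary R function. This is only one possible reading, not a proposal from the thread: the function name is borrowed from the pseudo-code, the placeholder is written `.val` because a name beginning with an underscore is not syntactic in R, and steps are passed as separate arguments rather than inside braces.

```r
# Hypothetical helper: evaluate each step in turn, rebinding `.val` to the
# result of the previous step. Steps are captured unevaluated.
do_chain_sequence <- function(initial, ...) {
  steps <- eval(substitute(alist(...)))      # list of unevaluated expressions
  env <- new.env(parent = parent.frame())    # scratch scope holding .val
  assign(".val", initial, envir = env)
  for (step in steps) {
    assign(".val", eval(step, env), envir = env)
  }
  get(".val", envir = env)
}

result <- do_chain_sequence(1:10, .val * 2, sum(.val), sqrt(.val))
stopifnot(isTRUE(all.equal(result, sqrt(110))))
```

Because `.val` lives in a scratch environment whose parent is the caller's frame, intermediate values do not leak into the caller, while the steps can still see the caller's variables. That addresses the concern about being evaluated inside a function only partly: assignments made inside a step would also stay in the scratch environment.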
Re: [Rd] New pipe operator
On Mon, Dec 7, 2020 at 2:02 PM Kevin Ushey wrote:
>
> IMHO the use of anonymous functions is a very clean solution to the
> placeholder problem, and the shorthand lambda syntax makes it much
> more ergonomic to use. Pipe implementations that crawl the RHS for
> usages of `.` are going to be more expensive than the alternatives. It

You wouldn't have to crawl the expression. This does it at the syntax level.

  e <- quote( { gsub("x", "y", .) } )
  c(e[[1]], quote(. <- LHS), e[-1])
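One way to make the syntax-level idea concrete is to rebuild the braced block with an assignment to `.` as its first statement and then evaluate it. This is a sketch under assumptions, not how magrittr is implemented: `LHS` stands for the piped value, and the call is assembled explicitly with as.list() and as.call() so that each statement stays a single element.

```r
e <- quote({ gsub("x", "y", .) })

# Split the `{` call into its function name and its statements, then
# splice `. <- LHS` in front of the existing statements.
parts <- as.list(e)
e2 <- as.call(c(parts[1], list(quote(. <- LHS)), parts[-1]))

# Evaluate in a scratch scope where LHS is bound to the piped value.
res <- eval(e2, list(LHS = "xax"))
stopifnot(identical(res, "yay"))
```

No walk over the expression tree is needed: the dot is bound once, and ordinary evaluation finds it wherever it appears inside the block.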
Re: [Rd] [R/S-PLUS] [EXTERNAL] Re: [External] anonymous functions
One advantage of the new pipe operator over magrittr's is that the former
works with substitute().

> f <- function(x, xlab=deparse1(substitute(x))) paste(sep="", xlab, ": ", paste(collapse=", ",x))
> 2^(1:4) |> f()
[1] "2^(1:4): 2, 4, 8, 16"
> 2^(1:4) %>% f()
[1] ".: 2, 4, 8, 16"

This is because the new one is at the parser level, so f() sees an
ordinary function call.

> dput(quote(2^(1:4) |> f()))
f(2^(1:4))

On Mon, Dec 7, 2020 at 10:35 AM Therneau, Terry M., Ph.D. via R-devel <r-devel@r-project.org> wrote:

> Luke,
>   Mostly an aside. I think that pipes are a good addition, and it is
> clear that you and other R-core thought through many of the details.
> Congratulations on what appears to be solid work. I've used Unix since
> '79, so it is almost guaranteed that I like the basic idiom, and I
> expect to make use of it. That some users think pipes -- or any other
> code -- are so clear that comments are superfluous is no reflection on
> R core, and is also a bit of a hobby horse for me.
>
> I am a bit bemused by the flood of change suggestions, before people
> have had a chance to fully exercise the new code. I'd suggest waiting
> several months, or a year, before major updates, straight up bugs
> excepted. The same advice holds when moving into a new house.
> One experience with the survival package has been that most new ideas
> have been implemented locally, and we run with them for half a year
> before submission to CRAN. I've had a few "really great" modifications
> that, thankfully, were never inflicted on the rest of the R community.
>
> Terry T.
>
> On 12/7/20 11:26 AM, luke-tier...@uiowa.edu wrote:
> > I don't disagree in principle, but the reality is users want shortcuts
> > and as a result various packages, in particular tidyverse, have been
> > providing them. Mostly based on formulas, mostly with significant
> > issues since formulas weren't designed for this, and mostly
> > incompatible (tidyverse ones are compatible within tidyverse but not
> > with others). And of course none work in sapply or lapply. Providing a
> > shorthand in base may help to improve this. You don't have to use it
> > if you don't want to, and you can establish coding standards that
> > disallow it if you like.
> >
> > Best,
> >
> > luke
> >
> > On Mon, 7 Dec 2020, Therneau, Terry M., Ph.D. via R-devel wrote:
> >
> >> “The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may
> >> be helpful in making code containing simple function expressions more
> >> readable.”
> >>
> >> Color me unimpressed.
> >> Over the decades I've seen several "who can write the shortest code"
> >> threads: in Fortran, in C, in Splus, ... The same old idea that
> >> "short" is a synonym for either elegant, readable, or efficient is
> >> now being recycled in the tidyverse. The truth is that "short" is
> >> actually an antonym for all of these things, at least for anyone else
> >> reading the code; or for the original coder 30-60 minutes after the
> >> "clever" lines were written. Minimal use of the spacebar and/or the
> >> return key isn't usually held up as a goal, but creeps into many
> >> practitioners' code as well.
> >>
> >> People are excited by replacing "function(" with "\("? Really? Are
> >> people typing code with their thumbs?
> >> I am ambivalent about pipes: I think it is a great concept, but too
> >> many of my colleagues think that using pipes = no need for any
> >> comments.
> >>
> >> As time goes on, I find my goal is to make my code less compact and
> >> more readable. Every bug fix or new feature in the survival package
> >> now adds more lines of comments or other documentation than lines of
> >> code. If I have to puzzle out what a line does, what about the poor
> >> sod who inherits the maintenance?
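Two of the points in this message can be checked directly, assuming
R 4.1.0 or later (where both `|>` and `\(x)` are available): that the
piped call is indistinguishable from the ordinary call once parsed, and
that the shorthand works inside sapply(), which the formula-based
shortcuts mentioned above do not. The function f() below is a reformatted
copy of the one in the message; the other names are only illustrative.

```r
# Label defaults to the deparsed argument expression
f <- function(x, xlab = deparse1(substitute(x))) {
  paste0(xlab, ": ", paste(x, collapse = ", "))
}

piped  <- 2^(1:4) |> f()   # rewritten by the parser to f(2^(1:4))
direct <- f(2^(1:4))

# The quoted pipe expression is already an ordinary call
parsed <- quote(2^(1:4) |> f())

# The shorthand is a plain function, so it can be passed to sapply()
squares <- sapply(1:3, \(x) x^2)
```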
Re: [Rd] New pipe operator
On Mon, Dec 7, 2020 at 10:35 AM Gabor Grothendieck wrote:

> On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch wrote:
> > An advantage of the current implementation is that it's simple and easy
> > to understand. Once you make it a user-modifiable binary operator,
> > things will go kind of nuts.
> >
> > For example, I doubt if there are many users of magrittr's pipe who
> > really understand its subtleties, e.g. the example in Luke's paper where
> > 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1, 1, 2). (And
> > I could add 1 %>% c(c(.), 2, .) and 1 %>% c(c(.), 2, . + 2) to
> > continue the fun.)
>
> The rule is not so complicated. Automatic insertion is done unless
> you use dot in the top level function or if you surround it with
> {...}. It really makes sense since if you use gsub(pattern,
> replacement, .) then surely you don't want automatic insertion and if
> you surround it with { ... } then you are explicitly telling it not
> to.

This is the point that I believe Duncan is trying to make (and I agree
with) though.

Consider the question "after piping LHS into RHS, what is the first
argument in the resulting call?". For the base pipe, the answer,
completely unambiguously, is LHS. Full stop. That is easy to understand.

For magrittr the answer is "Well, it depends, let me see your RHS
expression, is it wrapped in braces? If not, are you using the
placeholder? If you are using the placeholder, where/how are you using
it?". That is inherently much more complicated. Yes, you understand how
the magrittr pipe behaves, and yes you find it very convenient. That's
great, but neither of those things equates to simplicity. They just mean
that you, a very experienced pipe user, carry around the cognitive load
necessary to have that understanding.

More concretely, the current base pipe is extremely simple. All it does
is:

1. Figure out the RHS expression call
   1. If RHS is an anonymous function declaration, construct a call to
      it for a new RHS
2. Insert the LHS expression into the first argument position of the RHS
   call expression

Done. And the sub-step under (1) would be removed if anonymous functions
required () after them, which would be consistent, and even simpler, but
kind of annoying. I think it is a good compromise which is guaranteed to
be safe because anonymous functions are something the parser recognizes.
Either way, if that was dropped, what |> does would be *entirely*
trivial to understand and explain. With a single sentence.

I had the equivalent pseudocode for the magrittr pipe written out here
but honestly it felt like overkill that came across as mean, so I'll
leave that as an exercise to interested readers.

~G

> Assuming the existence of placeholders a possible simplification would
> be to NOT do automatic insertion if { ... } is used and to use it
> otherwise although personally having used it for some time I find the
> existing rule in magrittr generally does what you want.
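The two steps listed above can be mimicked on quoted expressions. This
is a sketch only: pipe_rewrite() is a hypothetical name, it works on
already-parsed calls rather than inside R's parser, and it makes no
claim to match the actual implementation beyond the rule as described
in this message.

```r
# Hypothetical helper mimicking the described rewrite on quoted expressions
pipe_rewrite <- function(lhs, rhs) {
  # Sub-step of 1: an anonymous function declaration becomes a call to it
  if (is.call(rhs) && identical(rhs[[1]], as.name("function"))) {
    rhs <- as.call(list(rhs))
  }
  stopifnot(is.call(rhs))
  # Step 2: insert LHS into the first argument position of the RHS call
  parts <- as.list(rhs)
  as.call(c(parts[1], list(lhs), parts[-1]))
}

# Ordinary call on the RHS: paste("a", sep = "-") gains "x" as first argument
call1 <- pipe_rewrite(quote("x"), quote(paste("a", sep = "-")))
res1  <- eval(call1)

# Anonymous function on the RHS: it is called with LHS as its argument
call2 <- pipe_rewrite(quote(1:3), quote(function(x) x + 1))
res2  <- eval(call2)
```

Nothing in the sketch inspects the arguments of the RHS, which is the
simplicity being argued for: the first argument of the result is always
the LHS.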
Re: [Rd] New pipe operator
On Mon, Dec 7, 2020 at 11:05 AM Kevin Ushey wrote:

> IMHO the use of anonymous functions is a very clean solution to the
> placeholder problem, and the shorthand lambda syntax makes it much
> more ergonomic to use. Pipe implementations that crawl the RHS for
> usages of `.` are going to be more expensive than the alternatives. It
> is nice that the `|>` operator is effectively the same as a regular R
> function call, and given the identical semantics could then also be
> reasoned about the same way regular R function calls are.

I agree. That said, one thing that maybe could be done, though I'm not
super convinced it's needed, is make a "curry-stuffed pipe", where
something like

  LHS |^pipearg^> RHS(arg1 = 5, arg3 = 7)

would parse to

  RHS(pipearg = LHS, arg1 = 5, arg3 = 7)

(Assuming we could get the parser to handle |^bla^> correctly)

For argument position issues that would be sufficient. For more
complicated expressions, e.g., those that would use the placeholder
multiple times or inside compound expressions, requiring anonymous
functions seems quite reasonable to me.

And honestly, while I kind of like it, I'm not sure if that "stuffed
pipe" expression (assuming we could get the parser to capture it
correctly) reads to me as nicer than the following, anyway.

  LHS |> \(x) RHS(arg1 = 5, pipearg = x, arg3 = 7)

~G

> I also agree usages of the `.` placeholder can make the code more
> challenging to read, since understanding the behavior of a piped
> expression then requires scouring the RHS for usages of `.`, which can
> be challenging in dense code. Piping to an anonymous function makes
> the intent clear to the reader: the programmer is likely piping to an
> anonymous function because they care where the argument is used in the
> call, and so the reader of code should be aware of that.
>
> Best,
> Kevin
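For readers who want to try the anonymous-function form, here is a small
sketch. It assumes R 4.1.0 or later, where the released pipe expects the
RHS to be a call, so the anonymous function is wrapped in parentheses
and called explicitly; the bare form quoted in this thread belongs to
the development version under discussion. The function rhs() and its
argument names are stand-ins for the RHS(arg1, pipearg, arg3) of the
message, not real functions.

```r
# Stand-in for RHS(): returns its arguments so the placement is visible
rhs <- function(arg1, pipearg, arg3) {
  c(arg1 = arg1, pipearg = pipearg, arg3 = arg3)
}

# The piped value lands in the middle argument via the anonymous function
out <- 6 |> (\(x) rhs(arg1 = 5, pipearg = x, arg3 = 7))()
```

The name x makes it explicit where the piped value ends up, which is the
readability point made in the quoted text.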
Re: [Rd] New pipe operator
On 12/7/20 11:09 PM, Gabriel Becker wrote:
> On Mon, Dec 7, 2020 at 11:05 AM Kevin Ushey wrote:
>
>> IMHO the use of anonymous functions is a very clean solution to the
>> placeholder problem, and the shorthand lambda syntax makes it much
>> more ergonomic to use. Pipe implementations that crawl the RHS for
>> usages of `.` are going to be more expensive than the alternatives. It
>> is nice that the `|>` operator is effectively the same as a regular R
>> function call, and given the identical semantics could then also be
>> reasoned about the same way regular R function calls are.
>
> I agree. That said, one thing that maybe could be done, though I'm not
> super convinced it's needed, is make a "curry-stuffed pipe", where
> something like
>
>   LHS |^pipearg^> RHS(arg1 = 5, arg3 = 7)
>
> would parse to
>
>   RHS(pipearg = LHS, arg1 = 5, arg3 = 7)

This gave me the idea that naming the arguments can be used to skip the
placeholder issue:

  "funny" |> sub(pattern = "f", replacement = "b")

Of course this breaks if the maintainer changes the order of the
function arguments (which is not a nice practice but happens).

An option could be to allow for a missing argument in the first
position, but this might add further undesired complexity, so it is
probably not worth the effort:

  "funny" |> sub(x =, "f", "b")

So basically the parsing rule would be:

  LHS |> RHS(arg=, ...) -> RHS(arg=LHS, ...)
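The named-argument idea can be checked with the base pipe, assuming
R 4.1.0 or later. The sketch contrasts the fully named call with the
positional one; the variable names are only illustrative.

```r
# Naming pattern and replacement leaves x as the first free position,
# so the piped string is matched to x
named <- "funny" |> sub(pattern = "f", replacement = "b")

# The same call as the parser rewrites it
named_call <- quote("funny" |> sub(pattern = "f", replacement = "b"))

# Without names the piped string is taken as the pattern instead:
# this is sub("funny", "f", "b"), which leaves "b" unchanged
positional <- "funny" |> sub("f", "b")
```

Only the first assignment does what was intended, which shows both why
the trick works and why it depends on every earlier argument being
named.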
Re: [Rd] anonymous functions
I mostly agree with your comments on anonymous functions.

However, I think the main problem is cryptic-ness, rather than
succinct-ness. The backslash is a relatively universal symbol within
programming languages with C-like (ALGOL-like?) syntax, where it denotes
escape sequences within strings.

Using the leading character for escape sequences, to define functions,
is like using integers to define floating point numbers:

    my.integer <- as.integer (2) * pi

Arguably, the motive is more to be ultra-succinct than cryptic.
But either way, we get syntax which is difficult to read, from a
mathematical and statistical perspective.

On Tue, Dec 8, 2020 at 6:04 AM Therneau, Terry M., Ph.D. via R-devel wrote:
>
> “The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may
> be helpful in making code containing simple function expressions more
> readable.”
>
> Color me unimpressed.
> Over the decades I've seen several "who can write the shortest code"
> threads: in Fortran, in C, in Splus, ... The same old idea that "short"
> is a synonym for either elegant, readable, or efficient is now being
> recycled in the tidyverse. The truth is that "short" is actually an
> antonym for all of these things, at least for anyone else reading the
> code; or for the original coder 30-60 minutes after the "clever" lines
> were written. Minimal use of the spacebar and/or the return key isn't
> usually held up as a goal, but creeps into many practitioners' code as
> well.
>
> People are excited by replacing "function(" with "\("? Really? Are
> people typing code with their thumbs?
> I am ambivalent about pipes: I think it is a great concept, but too
> many of my colleagues think that using pipes = no need for any
> comments.
>
> As time goes on, I find my goal is to make my code less compact and
> more readable. Every bug fix or new feature in the survival package now
> adds more lines of comments or other documentation than lines of code.
> If I have to puzzle out what a line does, what about the poor sod who
> inherits the maintenance?
>
> --
> Terry M Therneau, PhD
> Department of Health Science Research
> Mayo Clinic
> thern...@mayo.edu
>
> "TERR-ree THUR-noh"
Re: [Rd] New pipe operator
Hi Denes,

On Mon, Dec 7, 2020 at 2:52 PM Dénes Tóth wrote:

> This gave me the idea that naming the arguments can be used to skip the
> placeholder issue:
>
> "funny" |> sub(pattern = "f", replacement = "b")
>
> Of course this breaks if the maintainer changes the order of the
> function arguments (which is not a nice practice but happens).

This is true, but only if you are specifying all arguments that appear
before the one you want explicitly. In practice that may often be true?
But I don't really have a strong intuition about that as a non-pipe
user. It would require zero changes to the pipe by the R-core team
though, so in that sense it could be a solution in the cases where it
does work. It does make the code subtler to read though, which is a
pretty big downside, imho.

> An option could be to allow for missing argument in the first position,
> but this might add further undesired complexity, so probably not worth
> the effort:
>
> "funny" |> sub(x =, "f", "b")
>
> So basically the parsing rule would be:
>
> LHS |> RHS(arg=, ...) -> RHS(arg=LHS, ...)

The problem here is that it's ambiguous, because myfun(x, y=, z) is
technically syntactically valid. So this rule would change the meaning
of code that already parses, and would prevent existing, syntactically
valid (though hopefully quite rare) code from being used in the pipe
context.

~G
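The claim that a call with an empty named argument already parses can
be confirmed without evaluating anything. This is a small sketch; myfun
is the undefined placeholder name from the message and is never called.

```r
# quote() only parses the call; myfun does not need to exist
e <- quote(myfun(x, y = , z))

n_parts   <- length(e)   # function name plus three argument slots
arg_names <- names(e)    # only the empty middle argument carries a name
```

Since this form is accepted today, giving it a new meaning on the RHS of
a pipe would reinterpret code that already parses, which is the
ambiguity raised above.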
Re: [Rd] anonymous functions
Sorry, I should replace "cryptic-ness" from my last post, with
"unnecessary cryptic-ness".
Sometimes short symbolic expressions are necessary.

P.S.
Often, I wish I could write: f (x) = x^2.
But that's replacement function syntax.
Re: [Rd] anonymous functions
I will stick my oar in here as a user to say that I find the \(x) syntax
a bit line-noise-ish.

David