Re: [Rd] A few suggestions and perspectives from a PhD student

2017-05-08 Thread Antonin Klima
Thanks for the answers,

I’m aware of the ‘.’ option, just wanted to give a very simple example.

But the use of lapply’s ‘…’ parameter had eluded me, so thanks for enlightening me.
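
For anyone else who missed it, here is a minimal sketch of what the ‘…’ mechanism does, using the foo from my original mail. lapply forwards any extra arguments to the function it applies, so no anonymous wrapper is needed. The variable names via_dots and via_lambda are only illustrative.

```r
# foo as defined in the original mail
foo <- function(x, y) x + y

# Arguments given after the function are passed on through lapply's '...',
# so y = 3 is supplied on every call and each element of 1:5 becomes x.
via_dots <- lapply(1:5, foo, y = 3)

# The equivalent anonymous-function form that the '...' route avoids.
via_lambda <- lapply(1:5, function(x) foo(x, y = 3))

# Both forms should give the same list.
identical(via_dots, via_lambda)
```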

What do you mean by messing up the call stack? As far as I understand it, 
piping should translate into the same code as deep nesting, so I only see a 
tiny downside for debugging here, and no loss of time/space efficiency or 
anything. Your example, on the other hand, carries a chance of inadvertent 
error, because a variable is being reused and no one checks for me whether it 
is actually being passed between the lines. It also means having to specify 
the variable every single time. For me, that solution is clearly inferior.

Too bad you didn’t find my other comments interesting though.

>Why do you think being implemented in a contributed package restricts
>the usefulness of a feature?

I guess it depends on your philosophy. It may not restrict it per se, although 
it would make a lot of sense to me to reuse the bash-style ‘|’ and have a 
shorter, more readable version. One takes on an extra dependency on a package 
for an item that fits the language so well that it should be part of it. It is 
without doubt my most used operator, at least. Going through some of my 
folders, I found 101 uses in 750 lines, and 132 uses in 3303 lines. I would 
compare it to a computer game that is really good with a fan-created mod, but 
lacking otherwise. :)

So to me, it makes sense that if there is no doubt that a feature improves the 
language, and especially if people already use it extensively through a 
package, it should be part of the “standard”. The question is whether it is 
indeed very popular, and whether you share my view. But that’s now up to you; 
I just wanted to point it out, I guess.

Best Regards,
Antonin

> On 05 May 2017, at 22:33, Gabor Grothendieck  wrote:
> 
> Regarding the anonymous-function-in-a-pipeline point one can already
> do this which does use brackets but even so it involves fewer
> characters than the example shown.  Here { . * 2 } is basically a
> lambda whose argument is dot. Would this be sufficient?
> 
>  library(magrittr)
> 
>  1.5 %>% { . * 2 }
>  ## [1] 3
> 
> Regarding currying note that with magrittr Ista's code could be written as:
> 
>  1:5 %>% lapply(foo, y = 3)
> 
> or at the expense of slightly more verbosity:
> 
>  1:5 %>% Map(f = . %>% foo(y = 3))
> 
> 
> On Fri, May 5, 2017 at 1:00 PM, Antonin Klima  wrote:
>> Dear Sir or Madam,
>> 
>> I am in the 2nd year of my PhD in bioinformatics, after taking my Master’s in 
>> computer science, and have been using R heavily during my PhD. As such, I 
>> have put together a list of certain features in R that, in my opinion, would 
>> be beneficial to add, or could be improved. The first two are already 
>> implemented in packages, but being implemented as user-defined operators 
>> greatly restricts their usefulness. I hope you will find my 
>> suggestions interesting. If you find time, I will welcome any feedback as to 
>> whether you find the suggestions useful, or why you do not think they should 
>> be implemented. I would also welcome hearing about any features I 
>> might be unaware of that might solve the issues I have pointed out below.
>> 
>> 1) piping
>> Currently available in package magrittr, piping makes the code more 
>> readable by having the line start at its natural starting point, 
>> followed by the functions that are applied, in order. The readability of 
>> several nested calls with a number of parameters each is almost zero; it’s 
>> almost as if one would need to come up with the solution oneself. A pipeline, 
>> in comparison, is very straightforward, especially together with point 
>> (2).
>> 
>> The package works rather well, and the shortcomings of piping 
>> not being native are not quite as severe as in point (2). Nevertheless, an 
>> intuitive symbol such as | would be helpful, and it sometimes bothers me 
>> that I have to parenthesize an anonymous function, which would probably not be 
>> required with a native pipe operator, much like it is not required in, e.g., 
>> lapply. That is,
>> 1:5 %>% function(x) x+2
>> should be totally fine.
>> 
>> 2) currying
>> Currently available in package Curry. The idea is that, having a function 
>> such as foo = function(x, y) x+y, one would like to write for example 
>> lapply(1:5, foo(3)), and have the interpreter figure out that, OK, foo(3) 
>> does not produce a value, but it can still produce a function: a function of 
>> y. This would indeed be most useful for the various apply functions, rather 
>> than writing function(x) foo(3,x).
>> 
>> I suggest that currying would make the code easier to write, and more 
>> readable, especially when using apply functions. One might imagine that 
>> there could be some confusion with such a feature, especially from people 
>> unfamiliar with functional programming, although R already does take 
>> function as first-order arguments, so it cou

Re: [Rd] A few suggestions and perspectives from a PhD student

2017-05-08 Thread Ista Zahn
On Mon, May 8, 2017 at 8:08 AM, Antonin Klima  wrote:
> Thanks for the answers,
>
> I’m aware of the ‘.’ option, just wanted to give a very simple example.
>
> But the use of lapply’s ‘…’ parameter had eluded me, so thanks for enlightening me.
>
> What do you mean by messing up the call stack? As far as I understand it, 
> piping should translate into the same code as deep nesting.

Perhaps, but then magrittr is not really a pipe. Here is a simple example:

  library(magrittr)
  data.frame(x = 1) %>%
    subset(y == 1)
  ## Error in eval(e, x, parent.frame()) : object 'y' not found
  traceback()
  ## 12: eval(e, x, parent.frame())
  ## 11: eval(e, x, parent.frame())
  ## 10: subset.data.frame(., y == 1)
  ## 9: subset(., y == 1)
  ## 8: function_list[[k]](value)
  ## 7: withVisible(function_list[[k]](value))
  ## 6: freduce(value, `_function_list`)
  ## 5: `_fseq`(`_lhs`)
  ## 4: eval(quote(`_fseq`(`_lhs`)), env, env)
  ## 3: eval(quote(`_fseq`(`_lhs`)), env, env)
  ## 2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
  ## 1: data.frame(x = 1) %>% subset(y == 1)

  subset(data.frame(x = 1),
    y == 1)
  ## Error in eval(e, x, parent.frame()) : object 'y' not found
  traceback()
  ## 4: eval(e, x, parent.frame())
  ## 3: eval(e, x, parent.frame())
  ## 2: subset.data.frame(data.frame(x = 1), y == 1)
  ## 1: subset(data.frame(x = 1), y == 1)

It does pollute the call stack, making debugging harder.

> So then I only see a tiny downside for debugging here. No loss of
> time/space efficiency or anything. With a chance of inadvertent error
> in your example, coming from the fact that a variable is being reused
> and no one now checks for me whether it is being passed between the
> lines. And with having to specify the variable every single time. For
> me, that solution is clearly inferior.

There are tradeoffs. As demonstrated above, the pipe is clearly
inferior in that it is doing a lot of complicated stuff under the
hood, and when you try to traceback() through the call stack you have
to sift through all that complicated stuff. That's a pretty big
drawback in my opinion.
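
A rough way to see the extra frames without reading a traceback is to count them. The sketch below uses a toy two-layer helper purely to illustrate how wrapper functions deepen the stack. toy_pipe, apply_step and stack_depth are made-up names, not magrittr internals.

```r
# Report how many frames are on the call stack when this function runs.
stack_depth <- function(x) sys.nframe()

# Toy stand-in for a pipe implementation (hypothetical, not magrittr's code):
# two helper layers sit between the caller and the function being applied.
apply_step <- function(x, f) f(x)
toy_pipe <- function(x, f) apply_step(x, f)

direct_depth <- stack_depth(1)            # called directly
piped_depth  <- toy_pipe(1, stack_depth)  # called through the helpers

# Each helper layer contributes one additional frame to sift through.
extra_frames <- piped_depth - direct_depth
extra_frames
```

Every one of those extra frames also shows up in traceback() when the applied function fails.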

>
> Too bad you didn’t find my other comments interesting though.

I did not say that.

>
>>Why do you think being implemented in a contributed package restricts
>>the usefulness of a feature?
>
> I guess it depends on your philosophy. It may not restrict it per se, 
> although it would make a lot of sense to me to reuse the bash-style ‘|’ and 
> have a shorter, more readable version. One takes on an extra dependency on a 
> package for an item that fits the language so well that it should be part of 
> it. It is without doubt my most used operator, at least. Going through some 
> of my folders, I found 101 uses in 750 lines, and 132 uses in 3303 lines. I 
> would compare it to a computer game that is really good with a fan-created 
> mod, but lacking otherwise. :)

One of the key strengths of R is that packages are not akin to "fan
created mods". They are a central and necessary part of the R system.

>
> So to me, it makes sense that if there is no doubt that a feature improves 
> the language, and especially if people already use it extensively through a 
> package, it should be part of the “standard”. The question is whether it is 
> indeed very popular, and whether you share my view. But that’s now up to you; 
> I just wanted to point it out, I guess.


Re: [Rd] A few suggestions and perspectives from a PhD student

2017-05-08 Thread Hadley Wickham
> There are tradeoffs. As demonstrated above, the pipe is clearly
> inferior in that it is doing a lot of complicated stuff under the
> hood, and when you try to traceback() through the call stack you have
> to sift through all that complicated stuff. That's a pretty big
> drawback in my opinion.

To be precise, that is a problem with the current implementation of
the pipe. It's not a limitation of the pipe per se.
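
To illustrate, here is a minimal sketch of a pipe that simply rewrites its input into the nested call and evaluates that. The operator name %p% is made up for this example, and this is not how magrittr is implemented. It only handles a right-hand side written as a call, and it still adds its own %p% and eval frames to the stack.

```r
# Hypothetical operator: rewrites `lhs %p% f(args)` into `f(lhs, args)`
# and evaluates that nested call in the caller's environment.
`%p%` <- function(lhs, rhs) {
  lhs_expr <- substitute(lhs)   # unevaluated left-hand side
  rhs_call <- substitute(rhs)   # unevaluated right-hand side, e.g. sqrt()
  stopifnot(is.call(rhs_call))  # the right-hand side must be written as a call
  # Build f(lhs, args...) by inserting lhs as the first argument.
  nested <- as.call(c(list(rhs_call[[1]], lhs_expr), as.list(rhs_call)[-1]))
  eval(nested, parent.frame())
}

foo <- function(x, y) x + y

piped  <- c(1, 4, 9) %p% sqrt() %p% foo(y = 3)
nested <- foo(sqrt(c(1, 4, 9)), y = 3)

# The piped and the hand-nested versions should agree.
identical(piped, nested)
```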

Hadley

-- 
http://hadley.nz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel