Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Duncan Murdoch

On 06/12/2021 1:14 a.m., Radford Neal wrote:

> > The TL;DR version is base R support for a `+.character` method. This
> > would essentially provide a shortcut to `paste0`...
>
> In pqR (see pqR-project.org), I have implemented ! and !! as binary
> string concatenation operators, equivalent to paste0 and paste,
> respectively.
>
> For instance,
>
>  > "hello" ! "world"
>  [1] "helloworld"
>  > "hello" !! "world"
>  [1] "hello world"
>  > "hello" !! 1:4
>  [1] "hello 1" "hello 2" "hello 3" "hello 4"


I'm curious about the details:

Would `1 ! 2` convert both to strings?

Where does the binary ! fit in the operator priority?  E.g. how is

  a ! b > c

parsed?

Duncan Murdoch


 
> This seems preferable to overloading the + operator, which would lead
> to people reading code wondering whether a+b is doing an addition or a
> string concatenation.  There are very few circumstances in which one
> would want to write code where a+b might be either of these.  So it's
> better to make clear what is going on by having a different operator
> for string concatenation.
>
> Plus ! and !! seem natural for representing paste0 and paste, whereas
> using ++ for paste (with + for paste0) would look rather strange.
>
> Radford Neal

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
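
For readers without pqR, the behaviour of the two operators can be
approximated in base R with user-defined infix operators. This is only a
sketch: the names %+% and %++% are hypothetical stand-ins chosen for
illustration, and they are not part of base R or of pqR.

```r
# Hypothetical stand-ins for pqR's binary ! and !! (not part of base R).
`%+%`  <- function(a, b) paste0(a, b)  # like pqR's  a ! b
`%++%` <- function(a, b) paste(a, b)   # like pqR's  a !! b

"hello" %+% "world"    # "helloworld"
"hello" %++% "world"   # "hello world"
"hello" %++% 1:4       # "hello 1" "hello 2" "hello 3" "hello 4"
```

Because these are ordinary functions, they inherit the coercion and
recycling rules of paste0 and paste unchanged.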





[Rd] install.packages() and Additional_repositories

2021-12-06 Thread Thierry Onkelinx via R-devel
Dear R core team,

Writing R extensions mentions an optional 'Additional_repositories' field
in the DESCRIPTION file
(https://cran.r-project.org/doc/manuals/R-exts.html#Package-Dependencies).
Currently, install.packages() does not use that information when installing
a package. Would you accept a patch to amend this?

If so, should install.packages() use the `Additional_repositories` when
listed? Or should that be based on an extra argument? E.g. additional_repos
= FALSE (default).

Best regards,

Thierry

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkel...@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///
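
As an illustration of what such a patch would have to do first, the field
can already be read by hand with read.dcf(). This is a minimal sketch, not
the proposed patch: the helper name additional_repos() and the example
URLs are hypothetical, and it assumes the field holds a comma-separated
list of repository URLs.

```r
# Hypothetical helper: extract repository URLs from a DESCRIPTION file.
additional_repos <- function(desc_file) {
  value <- read.dcf(desc_file, fields = "Additional_repositories")[1, 1]
  if (is.na(value)) return(character(0))   # field not present
  repos <- trimws(strsplit(value, ",", fixed = TRUE)[[1]])
  repos[nzchar(repos)]
}

# Small self-contained demonstration with a throwaway DESCRIPTION file.
desc <- tempfile()
writeLines(c("Package: demo",
             "Version: 0.0.1",
             "Additional_repositories: https://example.org/repoA,",
             "    https://example.org/repoB"), desc)
extra <- additional_repos(desc)

# The result could then be passed on explicitly (not run here):
# install.packages("somePackage", repos = c(getOption("repos"), extra))
```

Passing the extra repositories explicitly keeps the decision with the
user, which is close in spirit to the opt-in argument suggested in the
message.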





[Rd] Great overhead for setTimeLimit?

2021-12-06 Thread Jiefei Wang
Hi all,

The documentation of 'setTimeLimit' states "Setting any limit has
a small overhead – well under 1% on the systems measured.", but
something seems wrong in my benchmark: enabling the time limit makes
it take about twice as long as the same benchmark without the limit.
Below is an example:

```
benchFunc <- function(x, data) {
    # x is unused; it only absorbs the index passed by lapply()
    value <- 0
    for (i in 1:5000) {
        for (j in seq_along(data))
            value <- value + data[j]
    }
    value
}
data <- sample(1:10, 10)

# Baseline: no time limit
setTimeLimit(Inf, Inf, FALSE)
system.time(lapply(1:5000, benchFunc, data = data))

# Same benchmark with a (generous) time limit enabled
setTimeLimit(999, 999, FALSE)
system.time(lapply(1:5000, benchFunc, data = data))
```

Here are the test results

> setTimeLimit(Inf, Inf, FALSE)
> system.time(lapply(1:5000, benchFunc, data = data))
   user  system elapsed
 10.809   0.006  10.812
> setTimeLimit(999, 999, FALSE)
> system.time(lapply(1:5000, benchFunc, data = data))
   user  system elapsed
 13.634   6.478  20.106


As a side note, it looks like the GC consumes the most CPU time. The
GC costs 10 secs without the time limit, but 19 secs with the limit.
Any thoughts?

Best,
Jiefei
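
One way to check the GC observation is to record gc.time() around the
timed expression. This is a rough sketch under stated assumptions: the
helper name time_with_gc() is hypothetical, the workload is a small
placeholder rather than the benchmark from this message, and gc.time()
may report zeros on platforms where GC timing is unavailable.

```r
# Hypothetical helper: elapsed time of an expression and the GC share of it.
time_with_gc <- function(expr) {
  gc_before <- gc.time()
  # gcFirst = FALSE so system.time() does not trigger an extra collection
  total <- system.time(expr, gcFirst = FALSE)
  gc_spent <- gc.time() - gc_before
  c(elapsed = total[["elapsed"]], gc_elapsed = gc_spent[[3]])
}

setTimeLimit(999, 999, FALSE)    # limit enabled
res <- time_with_gc(lapply(1:200, function(i) sum(as.numeric(1:1000))))
setTimeLimit(Inf, Inf, FALSE)    # remove the limit again
res
```

Running the same call with and without the limit would show whether the
extra time is attributed to garbage collection.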



Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Radford Neal
> > In pqR (see pqR-project.org), I have implemented ! and !! as binary
> > string concatenation operators, equivalent to paste0 and paste,
> > respectively.
> > 
> > For instance,
> > 
> >  > "hello" ! "world"
> >  [1] "helloworld"
> >  > "hello" !! "world"
> >  [1] "hello world"
> >  > "hello" !! 1:4
> >  [1] "hello 1" "hello 2" "hello 3" "hello 4"
> 
> I'm curious about the details:
> 
> Would `1 ! 2` convert both to strings?

They're equivalent to paste0 and paste, so 1 ! 2 produces "12", just
like paste0(1,2) does.  Of course, they wouldn't have to be exactly
equivalent to paste0 and paste - one could impose stricter
requirements if that seemed better for error detection.  Off hand,
though, I think automatically converting is more in keeping with the
rest of R.  Explicitly converting with as.character could be tedious.

I suppose disallowing logical arguments might make sense to guard
against typos where ! was meant to be the unary-not operator but
ended up being parsed as a binary operator.  I doubt that this would
be a common error, though.

(Note that there's no ambiguity when there are no typos, except that
when negation is involved a space may be needed - so, for example, 
"x" !  !TRUE is "xFALSE", but "x"!!TRUE is "x TRUE".  Existing uses of
double negation are still fine - eg, a <- !!TRUE still sets a to TRUE.
Parsing of operators is greedy, so "x"!!!TRUE is "x FALSE", not "xTRUE".)

> Where does the binary ! fit in the operator priority?  E.g. how is
> 
>   a ! b > c
> 
> parsed?

As (a ! b) > c.

Their precedence is between that of + and - and that of < and >.
So "x" ! 1+2 evaluates to "x3" and "x" ! 1+2 < "x4" is TRUE.

(Actually, pqR also has a .. operator that fixes the problems with
generating sequences with the : operator, and it has precedence lower
than + and - and higher than ! and !!, but that's not relevant if you
don't have the .. operator.)

   Radford Neal
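
The coercion and precedence rules described in this message can be
compared with what base R offers today. A minimal sketch, assuming a
hypothetical user-defined %+% operator as a stand-in for the binary !
(it is not part of base R or pqR). In base R every %op% operator binds
more tightly than + and -, so parentheses are needed to reproduce the
pqR results.

```r
# Hypothetical stand-in for pqR's binary ! (not part of base R).
`%+%` <- function(a, b) paste0(a, b)

paste0(1, 2)               # "12": both arguments are coerced to character
1 %+% 2                    # "12"

# pqR: "x" ! 1+2 is "x3". In base R, %+% has higher precedence than +,
# so the sum must be parenthesised explicitly.
"x" %+% (1 + 2)            # "x3"
("x" %+% (1 + 2)) < "x4"   # TRUE (string comparison)

# Without parentheses this parses as ("x" %+% 1) + 2 and fails.
res <- tryCatch("x" %+% 1 + 2, error = function(e) "error")

# Unary negation is unaffected: double negation still yields TRUE.
!!TRUE
```

The need for parentheses is one practical difference between a
user-defined operator and a parser-level operator with its own
precedence.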



Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Gabriel Becker
Hi All,

Seeing this and the other thread (and admittedly not having clicked through
to the linked r-help thread), I wonder about NAs.

Should NA concatenated with "hi there" not result in NA_character_? This is
not what any of the paste functions do, but in my opinion, NA + a non-NA
value seems like it should be NA (not "NA"), particularly if we are talking
about `+` overloading, but potentially even in the case of a distinct
concatenation operator?

I guess what I'm saying is that in my head missingness propagation rules
should take priority in such an operator (i.e. NA + anything should
*always* be NA).

Is that something others disagree with, or has it just not come up yet in
the parts of this discussion that I have read?

Best,
~G
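
To make the proposal concrete, here is a minimal sketch of a
concatenation operator in which missingness propagates. The operator
name %+% is hypothetical and chosen only for illustration; the sketch
wraps paste0 and then overwrites any position where either input is NA.

```r
# Hypothetical NA-propagating concatenation (name chosen for illustration).
`%+%` <- function(a, b) {
  out <- paste0(a, b)
  # Recycled like paste0's result; any missing input makes the result missing.
  out[is.na(a) | is.na(b)] <- NA_character_
  out
}

paste0(NA, "hi there")   # "NAhi there": what paste0 does today
NA %+% "hi there"        # NA_character_
c("a", NA) %+% "b"       # "ab" NA
```

This mirrors how arithmetic treats NA, at the cost of no longer being
identical to paste0.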



Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Avraham Adler
Gabe, I agree that missingness is important to factor in. To somewhat abuse
the terminology, NA is often used to represent missingness. Perhaps
concatenating character something with character something missing should
result in the original character?

Avi
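
The alternative suggested here, where a missing value contributes
nothing to the result, could be sketched as follows. The helper name
concat_skip_na() is hypothetical; it replaces NA with the empty string
before calling paste0.

```r
# Hypothetical helper: treat a missing value as "nothing to append".
concat_skip_na <- function(a, b) {
  a <- as.character(a)
  b <- as.character(b)
  paste0(ifelse(is.na(a), "", a), ifelse(is.na(b), "", b))
}

concat_skip_na("abc", NA)         # "abc"
concat_skip_na(c("x", NA), "y")   # "xy" "y"
```

Note that this silently discards the information that a value was
missing, which is the opposite of NA propagation.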



Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Duncan Murdoch

On 06/12/2021 4:21 p.m., Avraham Adler wrote:

> Gabe, I agree that missingness is important to factor in. To somewhat abuse
> the terminology, NA is often used to represent missingness. Perhaps
> concatenating character something with character something missing should
> result in the original character?


I think that's a bad idea.  If you wanted to represent an empty string, 
you should use "" or NULL, not NA.


I'd agree with Gabe, paste0("abc", NA) shouldn't give "abcNA", it should 
give NA.


Duncan Murdoch




Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Bill Dunlap
Should paste0(character(0), c("a","b")) give character(0)?
There is a fair bit of code that assumes that paste("X",NULL) gives "X" but
c(1,2)+NULL gives numeric(0).

-Bill
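
The contrast between paste0 and arithmetic on zero-length input can be
checked directly. A short sketch; the recycle0 argument used in the
last line assumes R 4.0.1 or later.

```r
# Zero-length arguments: paste0 treats them as "", arithmetic does not.
paste0(character(0), c("a", "b"))                    # "a" "b"
paste0("X", NULL)                                    # "X"
c(1, 2) + NULL                                       # numeric(0)

# recycle0 = TRUE (assumes R >= 4.0.1) gives the arithmetic-like behaviour.
paste0(character(0), c("a", "b"), recycle0 = TRUE)   # character(0)
```

An operator that followed the arithmetic rule would therefore differ
from the default paste0 on exactly this kind of input.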


Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Gabriel Becker
As I recall, there was a large discussion related to that which resulted in
the recycle0 argument being added (but defaulting to FALSE) for
paste/paste0.

I think a lot of these things ultimately mean that if there were to be a
string concatenation operator, it probably shouldn't have behavior
identical to paste0. Was that what you were getting at as well, Bill?

~G


Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread David Scott
I am surprised nobody so far has mentioned glue, which is an R
implementation of a Python idiom.

A great number of R packages on CRAN import it. It specifies how some
of the special cases considered so far are treated, which seems an
advantage:

 > library(glue)
 > glue(NA, 2)
NA2
 > glue(NA, 2, .sep = " ")
NA 2
 > glue(NA, 2, .na = NULL)
NA

David Scott
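
For comparison with the operator proposals, glue's usual style is
interpolation inside a string rather than concatenation of arguments. A
brief sketch, assuming the third-party glue package is installed; the
variable name x is only for illustration.

```r
# Assumes the third-party 'glue' package is installed.
if (requireNamespace("glue", quietly = TRUE)) {
  x <- "world"
  print(glue::glue("hello {x}"))            # interpolates x into the string
  print(glue::glue("hello {1:4}"))          # vectorised over the expression
  print(glue::glue("abc{NA}", .na = NULL))  # .na = NULL lets NA propagate
}
```

The .na argument makes the treatment of missing values an explicit
choice at the call site.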


Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Bill Dunlap
> I think a lot of these things ultimately mean that if there were to be a
> string concatenation operator, it probably shouldn't have behavior
> identical to paste0. Was that what you were getting at as well, Bill?

Yes.

On Mon, Dec 6, 2021 at 4:21 PM Gabriel Becker  wrote:

> As I recall, there was a large discussion related to that which resulted
> in the recycle0 argument being added (but defaulting to FALSE) for
> paste/paste0.
>
> I think a lot of these things ultimately mean that if there were to be a
> string concatenation operator, it probably shouldn't have behavior
> identical to paste0. Was that what you were getting at as well, Bill?
>
> ~G
>
> On Mon, Dec 6, 2021 at 4:11 PM Bill Dunlap 
> wrote:
>
>> Should paste0(character(0), c("a","b")) give character(0)?
>> There is a fair bit of code that assumes that paste("X",NULL) gives "X"
>> but c(1,2)+NULL gives numeric(0).
>>
>> -Bill
>>
>> On Mon, Dec 6, 2021 at 1:32 PM Duncan Murdoch 
>> wrote:
>>
>>> On 06/12/2021 4:21 p.m., Avraham Adler wrote:
>>> > Gabe, I agree that missingness is important to factor in. To somewhat
>>> > abuse the terminology, NA is often used to represent missingness. Perhaps
>>> > concatenating character something with character something missing
>>> > should result in the original character?
>>>
>>> I think that's a bad idea.  If you wanted to represent an empty string,
>>> you should use "" or NULL, not NA.
>>>
>>> I'd agree with Gabe, paste0("abc", NA) shouldn't give "abcNA", it should
>>> give NA.
>>>
>>> Duncan Murdoch
>>>
>>> >
>>> > Avi
>>> >
>>> > On Mon, Dec 6, 2021 at 3:35 PM Gabriel Becker 
>>> wrote:
>>> >
>>> >> Hi All,
>>> >>
>>> >> Seeing this and the other thread (and admittedly not having clicked
>>> >> through to the linked r-help thread), I wonder about NAs.
>>> >>
>>> >> Should NA  "hi there"  not result in NA_character_? This is not
>>> >> what any of the paste functions do, but in my opinion, NA +
>>> >> seems like it should be NA  (not "NA"), particularly if we are talking
>>> >> about `+` overloading, but potentially even in the case of a distinct
>>> >> concatenation operator?
>>> >>
>>> >> I guess what I'm saying is that in my head missingness propagation
>>> >> rules should take priority in such an operator (ie NA +  should
>>> >> *always* be NA).
>>> >>
>>> >> Is that something others disagree with, or has it just not come up
>>> >> yet in (the parts I have read) of this discussion?
>>> >>
>>> >> Best,
>>> >> ~G
>>> >>
>>> >> On Mon, Dec 6, 2021 at 10:03 AM Radford Neal 
>>> >> wrote:
>>> >>
>>> > In pqR (see pqR-project.org), I have implemented ! and !! as binary
>>> > string concatenation operators, equivalent to paste0 and paste,
>>> > respectively.
>>> >
>>> > For instance,
>>> >
>>> >   > "hello" ! "world"
>>> >   [1] "helloworld"
>>> >   > "hello" !! "world"
>>> >   [1] "hello world"
>>> >   > "hello" !! 1:4
>>> >   [1] "hello 1" "hello 2" "hello 3" "hello 4"
>>> 
>>>  I'm curious about the details:
>>> 
>>>  Would `1 ! 2` convert both to strings?
>>> >>>
>>> >>> They're equivalent to paste0 and paste, so 1 ! 2 produces "12", just
>>> >>> like paste0(1,2) does.  Of course, they wouldn't have to be exactly
>>> >>> equivalent to paste0 and paste - one could impose stricter
>>> >>> requirements if that seemed better for error detection.  Off hand,
>>> >>> though, I think automatically converting is more in keeping with the
>>> >>> rest of R.  Explicitly converting with as.character could be tedious.
>>> >>>
>>> >>> I suppose disallowing logical arguments might make sense to guard
>>> >>> against typos where ! was meant to be the unary-not operator, but
>>> >>> ended up being a binary operator, after some sort of typo.  I doubt
>>> >>> that this would be a common error, though.
>>> >>>
>>> >>> (Note that there's no ambiguity when there are no typos, except that
>>> >>> when negation is involved a space may be needed - so, for example,
>>> >>> "x" !  !TRUE is "xFALSE", but "x"!!TRUE is "x TRUE".  Existing uses
>>> >>> of double negation are still fine - eg, a <- !!TRUE still sets a to
>>> >>> TRUE.  Parsing of operators is greedy, so "x"!!!TRUE is "x FALSE",
>>> >>> not "xTRUE".)
>>> >>>
>>>  Where does the binary ! fit in the operator priority?  E.g. how is
>>> 
>>> a ! b > c
>>> 
>>>  parsed?
>>> >>>
>>> >>> As (a ! b) > c.
>>> >>>
>>> >>> Their precedence is between that of + and - and that of < and >.
>>> >>> So "x" ! 1+2 evaluates to "x3" and "x" ! 1+2 < "x4" is TRUE.
>>> >>>
>>> >>> (Actually, pqR also has a .. operator that fixes the problems with
>>> >>> generating sequences with the : operator, and it has precedence lower
>>> >>> than + and - and higher than ! and !!, but that's not relevant if you
>>> >>> don't have the .. operator.)
>>> >>>
>>> >>> Radford Neal
>>> >>>
>>> >>> __
>>> >>> R-devel

Re: [Rd] string concatenation operator (revisited)

2021-12-06 Thread Avi Gross via R-devel
After seeing what others are saying, it is clear that you need to carefully
think things out before designing any implementation of a more native
concatenation operator whether it is called "+" or anything else. There may
not be any ONE right solution but unlike a function version like paste()
there is nowhere to place any options that specify what you mean.

You can obviously expand paste() to accept arguments like replace.NA="" or
replace.NA="" and similar arguments on what to do if you see a NaN, an
Inf or -Inf, a NULL or even an NA_character_ and so on. Heck, you might tell
it to make other substitutions as in substitute=list(100=99, D=F) or any
other nonsense you can come up with.

But you have nowhere to put options when saying:

c <- a + b

Sure, you could set various global options before the addition and maybe
reset them after, but that is not a way I like to go for something this
basic.
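
To make the point about options concrete, here is a rough sketch in base R. A user-defined infix operator takes exactly two operands, so a policy such as "NA in, NA out" has to be baked into it, whereas a function can expose the same policy as an argument. The names %+%, paste0_na and na.propagate are invented for this illustration and are not part of R or of any proposal in this thread:

```r
# Hypothetical infix operator: paste0() semantics, except that an NA in
# either operand gives NA_character_ in the corresponding result element.
`%+%` <- function(a, b) {
  out <- paste0(a, b)
  out[is.na(a) | is.na(b)] <- NA_character_
  out
}

x1 <- "abc" %+% "def"          # "abcdef"
x2 <- "abc" %+% NA             # NA
x3 <- "id" %+% c(1, NA, 3)     # "id1" NA "id3"

# Hypothetical function form: the same policy, but selectable by the caller.
paste0_na <- function(..., na.propagate = FALSE) {
  out <- paste0(...)
  if (na.propagate) {
    has_na <- Reduce(`|`, lapply(list(...), is.na))
    out[has_na] <- NA_character_
  }
  out
}

y1 <- paste0_na("abc", NA)                       # "abcNA"
y2 <- paste0_na("abc", NA, na.propagate = TRUE)  # NA
```

With the operator there is one fixed answer to what "abc" and NA should give; with the function the caller chooses.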

And enough such tinkering makes me wonder if it is easier to ask a user to
use a slightly different function like this:

paste.no.na <- function(...) do.call(paste, Filter(Negate(is.na),
list(...)))

The above one-line function removes any NA from the argument list to make a
potentially shorter list before calling the real paste() using it.

Variations can, of course, be made that allow functionality as above. 
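
As a quick check of the one-liner above, and one possible variation: the Filter() approach tests each argument as a whole, so it is best suited to length-one arguments. The second function, paste_skip_na, is a hypothetical element-wise variant written for this illustration only (it is not a base R function) and assumes at least one argument of non-zero length:

```r
paste.no.na <- function(...) do.call(paste, Filter(Negate(is.na),
                                                   list(...)))

r_plain <- paste("a", NA, "b")         # "a NA b"
r_no_na <- paste.no.na("a", NA, "b")   # "a b"

# Hypothetical variant: drop NAs position by position after recycling all
# arguments to a common length.
paste_skip_na <- function(..., sep = " ") {
  args <- lapply(list(...), as.character)
  n <- max(lengths(args))
  args <- lapply(args, rep_len, length.out = n)
  vapply(seq_len(n), function(i) {
    parts <- vapply(args, `[`, character(1), i)
    paste(parts[!is.na(parts)], collapse = sep)
  }, character(1))
}

r_vec <- paste_skip_na(c("a", NA), "b")   # "a b" "b"
```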

If R was a true object-oriented language in the same sense as others like
Python, operator overloading of "+" might be doable in more complex ways but
we can only work with what we have. I tend to agree with others that in some
places R is so lenient that all kinds of errors can happen because it makes
a guess on how to correct it. Generally, if you really want to mix numeric
and character, many languages require you to transform any arguments to make
all of compatible types. The paste() function is clearly stated to coerce
all arguments to be of type character for you. Whereas a+b makes no such
promises and also is not properly defined even if a and b are both of type
character. Sure, we can expand the language but it may still do things some
find not to be quite what they wanted as in "2"+"3" becoming "23" rather
than 5. Right now, I can use as.numeric("2")+as.numeric("3") and get the
intended result after making very clear to anyone reading the code that I
wanted strings converted to floating point before the addition.
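
A short check of the current behaviour described above, in base R with no overloading of + assumed (the variable names are only for illustration):

```r
# "2" + "3" is an error today: + does not accept character operands.
sum_attempt <- tryCatch("2" + "3", error = function(e) "error")

# Explicit conversion makes the intent to add numbers unambiguous.
numeric_sum <- as.numeric("2") + as.numeric("3")   # 5

# paste0() coerces its arguments to character, so numbers concatenate.
pasted <- paste0(2, 3)                             # "23"
```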

As has been pointed out, the plus operator if used to concatenate does not
have a cognate for other operations like -*/ and R has used most other
special symbols for other purposes. So, sure, we can use something like 
(4 periods) if it is not already being used for something but using + here
is a tad confusing. Having said that, the makers of Python did make that
choice.

-Original Message-
From: R-devel  On Behalf Of Gabriel Becker
Sent: Monday, December 6, 2021 7:21 PM
To: Bill Dunlap 
Cc: Radford Neal ; r-devel 
Subject: Re: [Rd] string concatenation operator (revisited)

As I recall, there was a large discussion related to that which resulted in
the recycle0 argument being added (but defaulting to FALSE) for
paste/paste0.

I think a lot of these things ultimately mean that if there were to be a
string concatenation operator, it probably shouldn't have behavior identical
to paste0. Was that what you were getting at as well, Bill?

~G

On Mon, Dec 6, 2021 at 4:11 PM Bill Dunlap  wrote:

> Should paste0(character(0), c("a","b")) give character(0)?
> There is a fair bit of code that assumes that paste("X",NULL) gives "X"
> but c(1,2)+NULL gives numeric(0).
>
> -Bill
>
> On Mon, Dec 6, 2021 at 1:32 PM Duncan Murdoch wrote:
>
>> On 06/12/2021 4:21 p.m., Avraham Adler wrote:
>> > Gabe, I agree that missingness is important to factor in. To
>> > somewhat abuse the terminology, NA is often used to represent
>> > missingness. Perhaps concatenating character something with character
>> > something missing should result in the original character?
>>
>> I think that's a bad idea.  If you wanted to represent an empty 
>> string, you should use "" or NULL, not NA.
>>
>> I'd agree with Gabe, paste0("abc", NA) shouldn't give "abcNA", it 
>> should give NA.
>>
>> Duncan Murdoch
>>
>> >
>> > Avi
>> >
>> > On Mon, Dec 6, 2021 at 3:35 PM Gabriel Becker wrote:
>> >
>> >> Hi All,
>> >>
>> >> Seeing this and the other thread (and admittedly not having clicked
>> >> through to the linked r-help thread), I wonder about NAs.
>> >>
>> >> Should NA  "hi there"  not result in NA_character_? This
>> >> is not what any of the paste functions do, but in my opinion, NA +
>> >> seems like it should be NA  (not "NA"), particularly if we are
>> >> talking about `+` overloading, but potentially even in the case of
>> >> a distinct concatenation operator?
>> >>
>> >> I guess what I'm saying is that in my head missingness propagation
>> >> rules should take priority in such an operator (ie NA +
>> >> should *always* be NA).
>> >>
>> >> Is that something others