I fully agree! General string interpolation opens a gaping security hole and is
accompanied by all kinds of problems and decisions. What I envision instead is
something like this:
f"hello {name}"
Which gets parsed by R to this:
(STRINTERPSXP (CHARSXP (PROMISE nil)))
Basically, a new type of R language construct that still can be processed by
packages (for customized interpolation like in cli etc.), with a default eval
which is basically paste0(). The benefit here would be that this is eagerly
parsed and syntactically checked, and that the promise code could carry a
srcref. And of course, that you could pass an interpolated string expression
lazily between frames without losing the environment etc… For more advanced
applications, a low level string interpolation expression constructor could be
provided (that could either parse a general string — at the user’s risk, or
build it directly from expressions).
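
A rough user-level sketch of the idea in present-day R, under loud assumptions: interp() and render() are hypothetical names, not existing R functions, and a real STRINTERPSXP would be produced by the parser rather than by a function call. The sketch only illustrates three of the properties mentioned above: the placeholders are parsed eagerly, the defining environment travels with the object, and the default evaluation amounts to paste0(). Srcrefs, nesting and escaping of braces are not modelled.

```r
# Hypothetical constructor: split a template into literal text and parsed
# expressions. Parsing happens here, so a syntax error inside a placeholder
# is reported when the object is built, not when it is rendered.
interp <- function(template, env = parent.frame()) {
  m <- gregexpr("\\{[^{}]*\\}", template)
  holes <- regmatches(template, m)[[1]]                     # "{name}", ...
  literals <- regmatches(template, m, invert = TRUE)[[1]]   # text in between
  exprs <- lapply(holes, function(h) {
    parse(text = substr(h, 2, nchar(h) - 1))[[1]]
  })
  structure(list(literals = literals, exprs = exprs, env = env),
            class = "interp_string")
}

# Hypothetical default evaluator: evaluate each expression in the captured
# environment, interleave with the literals, and hand everything to paste0().
render <- function(x) {
  values <- lapply(x$exprs, eval, envir = x$env)
  n <- length(values)
  pieces <- vector("list", 2 * n + 1)
  pieces[seq(1, 2 * n + 1, by = 2)] <- as.list(x$literals)
  pieces[seq_len(n) * 2] <- values
  do.call(paste0, pieces)
}

# The object is built in one frame and rendered in another.
make <- function() {
  name <- "world"
  interp("hello {name}")
}
s <- make()
render(s)
```

A package wanting customized interpolation could, in this sketch, walk s$exprs and s$literals itself instead of calling render().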
— Taras
> On 7 Dec 2021, at 12:06, Simon Urbanek <[email protected]> wrote:
>
>
>
>> On Dec 7, 2021, at 22:09, Taras Zakharko <[email protected]> wrote:
>>
>> Great summary, Avi.
>>
>> String concatenation could be trivially added to R, but it probably should
>> not be. You will notice that modern languages tend not to use “+” to do
>> string concatenation (they either have a custom operator or a special kind
>> of pattern to do it) due to the practical issues such an approach brings
>> (implicit type casting, lack of commutativity, performance etc.). These
>> issues will be felt even more so in R with its weak typing, idiosyncratic
>> casting behavior and NAs.
>>
>> As others have pointed out, any kind of behavior one wants from string
>> concatenation can be implemented by custom operators as needed. This is not
>> something that needs to be in base R. I would rather see the effort
>> directed at improving string formatting (such as glue-style built-in
>> string interpolation).
>>
>
> This is getting OT, but there is a very good reason why string interpolation
> is not in core R. As I recall it has been considered some time ago, but it is
> very dangerous as it implies evaluation on constants which opens a huge
> security hole and has questionable semantics (where you evaluate etc). Hence
> it's much easier to ban a package than to hack it out of R ;).
>
> Cheers,
> Simon
>
>
>> — Taras
>>
>>
>>> On 7 Dec 2021, at 02:27, Avi Gross via R-devel <[email protected]>
>>> wrote:
>>>
>>> After seeing what others are saying, it is clear that you need to carefully
>>> think things out before designing any implementation of a more native
>>> concatenation operator whether it is called "+" or anything else. There may
>>> not be any ONE right solution but unlike a function version like paste()
>>> there is nowhere to place any options that specify what you mean.
>>>
>>> You can obviously expand paste() to accept arguments like replace.NA="" or
>>> replace.NA="<NA>" and similar arguments on what to do if you see a NaN, an
>>> Inf or -Inf, a NULL or even an NA_character_ and so on. Heck, you might tell
>>> it to make other substitutions as in substitute=list(100=99, D=F) or any
>>> other nonsense you can come up with.
>>>
>>> But you have nowhere to put options when saying:
>>>
>>> c <- a + b
>>>
>>> Sure, you could set various global options before the addition and maybe
>>> reset them after, but that is not a way I like to go for something this
>>> basic.
>>>
>>> And enough such tinkering makes me wonder if it is easier to ask a user to
>>> use a slightly different function like this:
>>>
>>> paste.no.na <- function(...) do.call(paste, Filter(Negate(is.na),
>>> list(...)))
>>>
>>> The above one-line function removes any NA from the argument list to make a
>>> potentially shorter list before calling the real paste() using it.
>>>
>>> Variations can, of course, be made that allow functionality as above.
>>>
>>> If R was a true object-oriented language in the same sense as others like
>>> Python, operator overloading of "+" might be doable in more complex ways but
>>> we can only work with what we have. I tend to agree with others that in some
>>> places R is so lenient that all kinds of errors can happen because it makes
>>> a guess on how to correct it. Generally, if you really want to mix numeric
>>> and character, many languages require you to transform any arguments to make
>>> all of compatible types. The paste() function is clearly stated to coerce
>>> all arguments to be of type character for you. Whereas a+b makes no such
>>> promises and also is not properly defined even if a and b are both of type
>>> character. Sure, we can expand the language but it may still do things some
>>> find not to be quite what they wanted as in "2"+"3" becoming "23" rather
>>> than 5. Right now, I can use as.numeric("2")+as.numeric("3") and get the
>>> intended result after making very clear to anyone reading the code that I
>>> wanted strings converted to floating point before the addition.
>>>
>>> As has been pointed out, the plus operator if used to concatenate does not
>>> have a cognate for other operations like -*/ and R has used most other
>>> special symbols for other purposes. So, sure, we can use something like ....
>>> (4 periods) if it is not already being used for something but using + here
>>> is a tad confusing. Having said that, the makers of Python did make that
>>> choice.
>>>
>>> -----Original Message-----
>>> From: R-devel <[email protected]> On Behalf Of Gabriel Becker
>>> Sent: Monday, December 6, 2021 7:21 PM
>>> To: Bill Dunlap <[email protected]>
>>> Cc: Radford Neal <[email protected]>; r-devel <[email protected]>
>>> Subject: Re: [Rd] string concatenation operator (revisited)
>>>
>>> As I recall, there was a large discussion related to that which resulted in
>>> the recycle0 argument being added (but defaulting to FALSE) for
>>> paste/paste0.
>>>
>>> I think a lot of these things ultimately mean that if there were to be a
>>> string concatenation operator, it probably shouldn't have behavior identical
>>> to paste0. Was that what you were getting at as well, Bill?
>>>
>>> ~G
>>>
>>> On Mon, Dec 6, 2021 at 4:11 PM Bill Dunlap <[email protected]> wrote:
>>>
>>>> Should paste0(character(0), c("a","b")) give character(0)?
>>>> There is a fair bit of code that assumes that paste("X",NULL) gives "X"
>>>> but c(1,2)+NULL gives numeric(0).
>>>>
>>>> -Bill
>>>>
>>>> On Mon, Dec 6, 2021 at 1:32 PM Duncan Murdoch
>>>> <[email protected]>
>>>> wrote:
>>>>
>>>>> On 06/12/2021 4:21 p.m., Avraham Adler wrote:
>>>>>> Gabe, I agree that missingness is important to factor in. To somewhat
>>>>>> abuse the terminology, NA is often used to represent missingness. Perhaps
>>>>>> concatenating character something with character something missing
>>>>>> should result in the original character?
>>>>>
>>>>> I think that's a bad idea. If you wanted to represent an empty
>>>>> string, you should use "" or NULL, not NA.
>>>>>
>>>>> I'd agree with Gabe, paste0("abc", NA) shouldn't give "abcNA", it
>>>>> should give NA.
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>>>
>>>>>> Avi
>>>>>>
>>>>>> On Mon, Dec 6, 2021 at 3:35 PM Gabriel Becker <[email protected]> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Seeing this and the other thread (and admittedly not having clicked
>>>>>>> through to the linked r-help thread), I wonder about NAs.
>>>>>>>
>>>>>>> Should NA <concat> "hi there" not result in NA_character_? This
>>>>>>> is not what any of the paste functions do, but in my opinion, NA +
>>>>>>> <non_na_value> seems like it should be NA (not "NA"), particularly if
>>>>>>> we are talking about `+` overloading, but potentially even in the case
>>>>>>> of a distinct concatenation operator?
>>>>>>>
>>>>>>> I guess what I'm saying is that in my head missingness propagation
>>>>>>> rules should take priority in such an operator (ie NA + <anything>
>>>>>>> should *always* be NA).
>>>>>>>
>>>>>>> Is that something others disagree with, or has it just not come up
>>>>>>> yet in (the parts I have read of) this discussion?
>>>>>>>
>>>>>>> Best,
>>>>>>> ~G
>>>>>>>
>>>>>>> On Mon, Dec 6, 2021 at 10:03 AM Radford Neal
>>>>>>> <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>>> In pqR (see pqR-project.org), I have implemented ! and !! as
>>>>>>>>>> binary string concatenation operators, equivalent to paste0 and
>>>>>>>>>> paste, respectively.
>>>>>>>>>>
>>>>>>>>>> For instance,
>>>>>>>>>>
>>>>>>>>>>> "hello" ! "world"
>>>>>>>>>> [1] "helloworld"
>>>>>>>>>>> "hello" !! "world"
>>>>>>>>>> [1] "hello world"
>>>>>>>>>>> "hello" !! 1:4
>>>>>>>>>> [1] "hello 1" "hello 2" "hello 3" "hello 4"
>>>>>>>>>
>>>>>>>>> I'm curious about the details:
>>>>>>>>>
>>>>>>>>> Would `1 ! 2` convert both to strings?
>>>>>>>>
>>>>>>>> They're equivalent to paste0 and paste, so 1 ! 2 produces "12",
>>>>>>>> just like paste0(1,2) does. Of course, they wouldn't have to be
>>>>>>>> exactly equivalent to paste0 and paste - one could impose
>>>>>>>> stricter requirements if that seemed better for error detection.
>>>>>>>> Off hand, though, I think automatically converting is more in
>>>>>>>> keeping with the rest of R. Explicitly converting with as.character
>>>>>>>> could be tedious.
>>>>>>>>
>>>>>>>> I suppose disallowing logical arguments might make sense to guard
>>>>>>>> against typos where ! was meant to be the unary-not operator, but
>>>>>>>> ended up being a binary operator, after some sort of typo. I
>>>>>>>> doubt that this would be a common error, though.
>>>>>>>>
>>>>>>>> (Note that there's no ambiguity when there are no typos, except
>>>>>>>> that when negation is involved a space may be needed - so, for
>>>>>>>> example, "x" ! !TRUE is "xFALSE", but "x"!!TRUE is "x TRUE".
>>>>>>>> Existing uses of double negation are still fine - eg, a <- !!TRUE
>>>>>>>> still sets a to TRUE.
>>>>>>>> Parsing of operators is greedy, so "x"!!!TRUE is "x FALSE", not
>>>>>>>> "xTRUE".)
>>>>>>>>
>>>>>>>>> Where does the binary ! fit in the operator priority? E.g. how
>>>>>>>>> is
>>>>>>>>>
>>>>>>>>> a ! b > c
>>>>>>>>>
>>>>>>>>> parsed?
>>>>>>>>
>>>>>>>> As (a ! b) > c.
>>>>>>>>
>>>>>>>> Their precedence is between that of + and - and that of < and >.
>>>>>>>> So "x" ! 1+2 evaluates to "x3" and "x" ! 1+2 < "x4" is TRUE.
>>>>>>>>
>>>>>>>> (Actually, pqR also has a .. operator that fixes the problems
>>>>>>>> with generating sequences with the : operator, and it has
>>>>>>>> precedence lower than + and - and higher than ! and !!, but
>>>>>>>> that's not relevant if you don't have the .. operator.)
>>>>>>>>
>>>>>>>> Radford Neal
>>>>>>>>
>>>>>>>> ______________________________________________
>>>>>>>> [email protected] mailing list
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>>>>
>>>>>>>
>>>>>>> [[alternative HTML version deleted]]