[Rd] Should seq.Date() return double storage?

2021-09-07 Thread Michael Chirico via R-devel
today <- Sys.Date()
typeof(today)
# [1] "double"
typeof(seq(today, by=1, length.out=2))
# [1] "integer"

Clearly minor as it doesn't seem to have come up before (e.g. coercion
to numeric will happen automatically whenever fractional dates are
needed); I only noticed because of a test using identical failing:

identical(
  seq(today, by=1, length.out=10),
  today + 0:9
)
# [1] FALSE

It's easy in this case to fix the test using coercion, but this could
(and did in practice) come up at deeper levels of nesting that become
more onerous to handle. And using all.equal() comes with its own
tradeoffs.

The fix would be easy enough -- at a glance there are two usages of
.Date(seq.int(...)) in seq.Date() that could be replaced by
.Date(as.numeric(seq.int(...))).

Mike C

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] .onLoad, packageStartupMessage, and R CMD check

2021-11-04 Thread Michael Chirico via R-devel
I wrote a linter to stop users from using packageStartupMessage() in
their .onLoad() hook because of the R CMD check warning it triggers:

https://github.com/wch/r-source/blob/8b6625e39cd62424dc23399dade37f20fa8afa91/src/library/tools/R/QC.R#L5167

However, this received some pushback which I ultimately agree with,
and moreover ?.onLoad seems to agree as well:

> Loading a namespace should where possible be silent, with startup
messages given by \code{.onAttach}. These messages (**and any essential
ones from \code{.onLoad}**) should use \code{\link{packageStartupMessage}}
so they can be silenced where they would be a distraction.

**emphasis** mine. That is, if we think some message is _essential_ to
print during loadNamespace(), we are told to use
packageStartupMessage().

Should we remove this R CMD check warning?

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] .onLoad, packageStartupMessage, and R CMD check

2021-11-05 Thread Michael Chirico via R-devel
Examining more closely, it's a NOTE produced by R CMD check --
originally I had thought it was a WARNING, which I think would have
been too strong for this case. A NOTE actually seems fine, on second
thought.

For a tiny bit of context, it's common for us to issue messaging
around some state initialization, which has to happen after some
(ex-ante unknown) set of packages are loaded. It's important to do so
whether or not the package is attached, so the proviso in .onLoad()
indeed makes the most sense.

Thanks!

On Thu, Nov 4, 2021 at 1:02 PM Gabriel Becker  wrote:
>
> Hi Michael,
>
> Indeed, just to elaborate further on what I believe Duncan's point is, can 
> you give any examples, "dire" or not, that are appropriate when the package 
> is loaded but not attached (ie none of its symbols are visible to the user 
> without using :::)?
>
> The only things I can think of are a package that changes the behavior of 
> other, attached package code, such as conflicted. Doing so is very much an 
> anti-pattern imo generally, with something like conflicted being an 
> (arguable) exception. And that's assuming conflicted even works/does anything 
> when loaded but not attached (I have not confirmed whether thats the case or 
> not). That or a package that is at end-of-life and is or soon will be 
> unsupported entirely.
>
> The examples don't need to be yours, per se, if you know what those pushing 
> back against your linter were using messages from .onLoad for...
>
> Best,
> ~G
>
>
>
> On Thu, Nov 4, 2021 at 12:37 PM Duncan Murdoch  
> wrote:
>>
>> On 04/11/2021 2:50 p.m., Michael Chirico via R-devel wrote:
>> > I wrote a linter to stop users from using packageStartupMessage() in
>> > their .onLoad() hook because of the R CMD check warning it triggers:
>> >
>> > https://github.com/wch/r-source/blob/8b6625e39cd62424dc23399dade37f20fa8afa91/src/library/tools/R/QC.R#L5167
>> >
>> > However, this received some pushback which I ultimately agree with,
>> > and moreover ?.onLoad seems to agree as well:
>> >
>> >> Loading a namespace should where possible be silent, with startup
>> > messages given by \code{.onAttach}. These messages (**and any essential
>> > ones from \code{.onLoad}**) should use \code{\link{packageStartupMessage}}
>> > so they can be silenced where they would be a distraction.
>> >
>> > **emphasis** mine. That is, if we think some message is _essential_ to
>> > print during loadNamespace(), we are told to use
>> > packageStartupMessage().
>> >
>> > Should we remove this R CMD check warning?
>>
>> The help page doesn't define what an "essential" message would be, but I
>> would assume it's a message about some dire condition, not just "Hi! I
>> just got loaded!".  So I think a note or warning would be appropriate,
>> but not an error.
>>
>> Do you have an example of something that should routinely print, but
>> that triggers a warning when checked?
>>
>> Duncan Murdoch
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Shouldn't "Loading" be "Attaching" when referring to library() calls?

2023-01-09 Thread Michael Chirico via R-devel
require() and library() both emit this message immediately before
running library():

"Loading required package: %s"

https://github.com/r-devel/r-svn/blob/4198a2941b702d965bb2374c2b908f48c369f40a/src/library/base/R/library.R#L967-L968

https://github.com/r-devel/r-svn/blob/4198a2941b702d965bb2374c2b908f48c369f40a/src/library/base/R/library.R#L684-L685

Shouldn't "Loading" be "Attaching" instead?

My understanding is "a package is loaded" is equivalent to
"isNamespaceLoaded()", i.e., loadNamespace() was run. And that "a
package is attached" is equivalent to "pkg %in% .packages()", i.e.,
library(pkg) was run.

Is the terminology not so precise?

There's certainly ambiguity in the internal variable names referenced
above -- in require, we see

loaded <- paste0("package:", package) %in% search()
https://github.com/r-devel/r-svn/blob/4198a2941b702d965bb2374c2b908f48c369f40a/src/library/base/R/library.R#L680

Whereas in library() [via .getRequiredPackages2()] we see

attached <- paste0("package:", pkg) %in% search()
https://github.com/r-devel/r-svn/blob/4198a2941b702d965bb2374c2b908f48c369f40a/src/library/base/R/library.R#L931

Mike C

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Line-terminal \ in character consants -- missing from ?Quotes ?

2023-02-11 Thread Michael Chirico via R-devel
I'm coming across some code that uses the fact the parser ignores a
line-terminal '\', e.g.

identical("\
", "\n")
# [1] TRUE

x = "abc \
def"
y = "abc \ndef"
identical(x, y)
# [1] TRUE

However:
identical("\\n", "\n")
# [1] FALSE

This appears to be undocumented behavior; the closest thing I see in
?Quotes suggests it should be an error:

> Escaping a character not in the following table is an error.

('\n' is in the table, but my understanding is the 'n' is what's being
escaped v-a-v the "error", which seems confirmed by the third, FALSE,
example above)

Is this a bug, is the omission from ?Quotes a bug, or is this just
undocumented behavior?

Mike C

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Line-terminal \ in character consants -- missing from ?Quotes ?

2023-02-12 Thread Michael Chirico via R-devel
I'm still hung up on ?Quotes -- I can't see mention of 'newline' as a
valid escape. It mentions the literal sequence '\' 'n', where 'n' is
being escaped.

Glanced at the parser blame and apparently the terminal '\' is the
older behavior, and what I'm used to, i.e. literal newlines in char
constants to make multi-line strings, is new (though still 20 years
old):

https://github.com/r-devel/r-svn/commit/bc3f20e4e686be556877bb6bd2882ae8029fd17f

The NEWS entry there does say the same thing as you -- "escaping the
newlines with backslashes".

>From the parser, I think ?Quotes is just missing "newline" as being a
valid escaped character, c.f.

https://github.com/r-devel/r-svn/blob/f55b24945d56e824f124638c596b99887441354a/src/main/gram.y#L2823-L2830
('\n' is treated like '\')
https://github.com/r-devel/r-svn/blob/f55b24945d56e824f124638c596b99887441354a/src/main/gram.y#L2978-L3008
('\n' is in the list of valid items after '\')

I don't see any special handling for '\r', so there may be a gap in
the R parser? Or I just don't understand what I'm reading in the
parser :)

Mike C

On Sun, Feb 12, 2023 at 3:38 AM Duncan Murdoch  wrote:
>
> On 12/02/2023 12:07 a.m., Michael Chirico via R-devel wrote:
> > I'm coming across some code that uses the fact the parser ignores a
> > line-terminal '\', e.g.
> >
> > identical("\
> > ", "\n")
> > # [1] TRUE
> >
> > x = "abc \
> > def"
> > y = "abc \ndef"
> > identical(x, y)
> > # [1] TRUE
> >
> > However:
> > identical("\\n", "\n")
> > # [1] FALSE
> >
> > This appears to be undocumented behavior; the closest thing I see in
> > ?Quotes suggests it should be an error:
> >
> >> Escaping a character not in the following table is an error.
> >
> > ('\n' is in the table, but my understanding is the 'n' is what's being
> > escaped v-a-v the "error", which seems confirmed by the third, FALSE,
> > example above)
> >
> > Is this a bug, is the omission from ?Quotes a bug, or is this just
> > undocumented behavior?
>
> In your first example, you have a backslash which says to escape the
> next char.  The next char is a newline char.  The result is an escaped
> newline, which apparently is a newline.
>
> The same thing happens in the second example.
>
> The third example is an escaped backslash, i.e. a backslash, followed by
> n.  That's not the same as an escaped n, which gives a newline.
>
> So I think the behaviour might be reasonable.
>
> The thing I'd worry about is whether things are handled properly on
> Windows, where the newline is two characters (CR LF).  It might be that
> the backslash at the end of the line escapes the CR, and you get a \r
> out of it instead of a \n.  But maybe not, the parser knows about CR LF
> and internally converts it to \n, so if that happens early enough,
> things would be fine.
>
> Duncan Murdoch
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] FR: names= argument for load()

2023-02-28 Thread Michael Chirico via R-devel
I have three use cases in mind for an argument to load specifying
which variables to load from the input 'file':

1. Avoid overwriting existing variables. ?load recommends using envir=
or attach() to avoid overwrites. An argument like names= would allow
even finer control to avoid collisions.
2. Avoid loading too many (in quantity or in memory size) variables
from a large file. We might save dozens or hundreds of models to the
same file and then load them more lazily.
3. Helping static analysis. Currently, when load() is used in a
script, it becomes impossible to distinguish truly undefined variables
from those defined implicitly by load(). With a names= argument, it
can be possible to know statically precisely which names were
introduced by load, and which might be a bug. (of course there's
nothing stopping authors from having non-literal entries to names=,
but that's a more general issue of static analysis).

Michael Chirico

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] tab-complete for non-syntactic names could attempt backtick-wrapping

2023-03-01 Thread Michael Chirico via R-devel
Consider:

x <- list(`a b` = 1)
x$a

(i.e., press the 'tab' key after typing 'x$a')

The auto-complete mechanism will fill the buffer like so:
x$a b

This is not particularly helpful because this is now a syntax error.

It seems to me there's a simple fix -- in
utils:::specialCompletions(), we can wrap the result of
utils:::specialOpCompletionsHelper() with backticks for non-syntactic
names ([1]):

comps <- specialOpCompletionsHelper(op, suffix, prefix)
if (length(comps) == 0L) comps <- ""
+non_syntactic <- make.names(comps) != comps
+comps[non_syntactic] <- paste0("`", comps[non_syntactic], "`")
sprintf("%s%s%s", prefix, op, comps)

I'm somewhat surprised this hasn't come up before (I searched for
'completeToken', 'specialCompletions', and
'specialOpCompletionsHelper' here and on Bugzilla), so I'm checking
with the list first if I'm missing anything before filing a patch.

Mike C

[1] 
https://github.com/r-devel/r-svn/blob/4657f65a377cb5ef318c6548bc264e3b0f9517a0/src/library/utils/R/completion.R#L536-L538

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] tab-complete for non-syntactic names could attempt backtick-wrapping

2023-03-01 Thread Michael Chirico via R-devel
Great suggestion! I've started a patch:
https://bugs.r-project.org/show_bug.cgi?id=18479

On Wed, Mar 1, 2023 at 1:56 AM Ivan Krylov  wrote:
>
> В Wed, 1 Mar 2023 01:36:02 -0800
> Michael Chirico via R-devel  пишет:
>
> > +comps[non_syntactic] <- paste0("`", comps[non_syntactic], "`")
>
> There are a few more corner cases. For example, comps could contain
> backticks (which should be escaped with backslashes) and backslashes
> (which should also be escaped). Thankfully, \u-style Unicode escape
> sequences are not currently supported inside backticks, and "escape the
> backslash" rule already takes care of them.
>
> The deparse() function already knows these rules:
>
> name <- 'hello world ` \\uFF'
> cat(deparse1(as.name(name), backtick=TRUE), '\n')
> # `hello world \` \\uFF`
> `hello world \` \\uFF` <- 'hello'
> `hello world \` \\uFF`
> # [1] "hello"
>
> --
> Best regards,
> Ivan

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] tab-complete for non-syntactic names could attempt backtick-wrapping

2023-03-02 Thread Michael Chirico via R-devel
I personally wouldn't like using a string, and this comment makes me
think it's against the r-core preference as well:

https://bugs.r-project.org/show_bug.cgi?id=18429#c1

Thanks both for catching the sloppy mistake in vapply() :)

Let's continue discussion on the bug/PR.

On Thu, Mar 2, 2023 at 12:39 AM Ivan Krylov  wrote:
>
> There turn out to be a few more things to fix.
>
> One problem is easy to solve: vapply() needs a third argument
> specifying the type of the return value. (Can we have unit tests for
> tab completion?)
>
> The other problem is harder: `comps` defaults to an empty string, and
> you can't have a symbol consisting of an empty string, because this
> value is internally reserved for missing function arguments. I think
> you can return(paste0(prefix, op)) if length(comps) == 0L, but this is
> still somewhat worrying. R tries to prevent empty names, so I wouldn't
> expect specialOpCompletionsHelper() to return an empty string, but I
> can't prove it right now.
>
> On the other hand, x$'a string' is the same as x$`a string`. Could we
> just drop as.name() and keep the return value being a string literal?
> I'm not sure about this, either.
>
> --
> Best regards,
> Ivan

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Improving user-friendliness of S4 dispatch failure when mis-naming arguments?

2023-08-09 Thread Michael Chirico via R-devel
I fielded a debugging request from a non-expert user today. At root
was running the following:

dbGetQuery(connection = conn, query = query)

The problem is that they've named the arguments incorrectly -- it
should have been [1]:

dbGetQuery(conn = conn, statement = query)

The problem is that the error message "looks" highly confusing to the
untrained eye:

Error in (function (classes, fdef, mtable)  :   unable to find an
inherited method for function ‘dbGetQuery’ for signature ‘"missing",
"missing"’

In retrospect, of course, this makes sense -- the mis-named arguments
are getting picked up by '...', leaving the required arguments
missing.

But I was left wondering how we could help users right their own ship here.

Would it help to mention the argument names? To include some code
checking for weird combinations of missing arguments? Any other
suggestions?

Mike C

[1] 
https://github.com/r-dbi/DBI/blob/97934c885749dd87a6beb10e8ccb6a5ebea3675e/R/dbGetQuery.R#L62-L64

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Improving user-friendliness of S4 dispatch failure when mis-naming arguments?

2023-08-10 Thread Michael Chirico via R-devel
I forwarded that along to the original reporter with positive feedback
-- including the argument names is definitely a big help for cuing
what exactly is missing.

Would a patch to do something similar for S4 be useful?

On Thu, Aug 10, 2023 at 6:46 AM Hadley Wickham  wrote:
>
> Hi Michael,
>
> I can't help with S4, but I can help to make sure this isn't a problem
> with S7. What do you think of the current error message? Do you see
> anything obvious we could do to improve?
>
> library(S7)
>
> dbGetQuery <- new_generic("dbGetQuery", c("conn", "statement"))
> dbGetQuery(connection = NULL, query = NULL)
> #> Error: Can't find method for generic `dbGetQuery(conn, statement)`:
> #> - conn     : MISSING
> #> - statement: MISSING
>
> Hadley
>
> On Wed, Aug 9, 2023 at 10:02 PM Michael Chirico via R-devel
>  wrote:
> >
> > I fielded a debugging request from a non-expert user today. At root
> > was running the following:
> >
> > dbGetQuery(connection = conn, query = query)
> >
> > The problem is that they've named the arguments incorrectly -- it
> > should have been [1]:
> >
> > dbGetQuery(conn = conn, statement = query)
> >
> > The problem is that the error message "looks" highly confusing to the
> > untrained eye:
> >
> > Error in (function (classes, fdef, mtable)  :   unable to find an
> > inherited method for function ‘dbGetQuery’ for signature ‘"missing",
> > "missing"’
> >
> > In retrospect, of course, this makes sense -- the mis-named arguments
> > are getting picked up by '...', leaving the required arguments
> > missing.
> >
> > But I was left wondering how we could help users right their own ship here.
> >
> > Would it help to mention the argument names? To include some code
> > checking for weird combinations of missing arguments? Any other
> > suggestions?
> >
> > Mike C
> >
> > [1] 
> > https://github.com/r-dbi/DBI/blob/97934c885749dd87a6beb10e8ccb6a5ebea3675e/R/dbGetQuery.R#L62-L64
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> http://hadley.nz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Improving user-friendliness of S4 dispatch failure when mis-naming arguments?

2023-08-10 Thread Michael Chirico via R-devel
Here's a trivial patch that offers some improvement:

Index: src/library/methods/R/methodsTable.R
===
--- src/library/methods/R/methodsTable.R (revision 84931)
+++ src/library/methods/R/methodsTable.R (working copy)
@@ -752,11 +752,12 @@
   if(length(methods) == 1L)
 return(methods[[1L]]) # the method
   else if(length(methods) == 0L) {
-cnames <- paste0("\"", vapply(classes, as.character, ""), "\"",
+cnames <- paste0(head(fdef@signature, length(classes)), "=\"",
vapply(classes, as.character, ""), "\"",
  collapse = ", ")
 stop(gettextf("unable to find an inherited method for function %s
for signature %s",
   sQuote(fdef@generic),
   sQuote(cnames)),
+ call. = FALSE,
  domain = NA)
   }
   else

Here's the upshot for the example on DBI:

dbGetQuery(connection = conn, query = query)
Error: unable to find an inherited method for function ‘dbGetQuery’
for signature ‘conn="missing", statement="missing"’

I don't have any confidence about edge cases / robustness of this
patch for generic S4 use cases (make check-all seems fine), but I
don't suppose a full patch would be dramatically different from the
above.

Mike C

On Thu, Aug 10, 2023 at 12:39 PM Gabriel Becker  wrote:
>
> I just want to add my 2 cents that I think it would be very useful and 
> beneficial to improve S4 to surface that information as well.
>
> More information about the way that the dispatch failed would be of great 
> help in situations like the one Michael pointed out.
>
> ~G
>
> On Thu, Aug 10, 2023 at 9:59 AM Michael Chirico via R-devel 
>  wrote:
>>
>> I forwarded that along to the original reporter with positive feedback
>> -- including the argument names is definitely a big help for cuing
>> what exactly is missing.
>>
>> Would a patch to do something similar for S4 be useful?
>>
>> On Thu, Aug 10, 2023 at 6:46 AM Hadley Wickham  wrote:
>> >
>> > Hi Michael,
>> >
>> > I can't help with S4, but I can help to make sure this isn't a problem
>> > with S7. What do you think of the current error message? Do you see
>> > anything obvious we could do to improve?
>> >
>> > library(S7)
>> >
>> > dbGetQuery <- new_generic("dbGetQuery", c("conn", "statement"))
>> > dbGetQuery(connection = NULL, query = NULL)
>> > #> Error: Can't find method for generic `dbGetQuery(conn, statement)`:
>> > #> - conn : MISSING
>> > #> - statement: MISSING
>> >
>> > Hadley
>> >
>> > On Wed, Aug 9, 2023 at 10:02 PM Michael Chirico via R-devel
>> >  wrote:
>> > >
>> > > I fielded a debugging request from a non-expert user today. At root
>> > > was running the following:
>> > >
>> > > dbGetQuery(connection = conn, query = query)
>> > >
>> > > The problem is that they've named the arguments incorrectly -- it
>> > > should have been [1]:
>> > >
>> > > dbGetQuery(conn = conn, statement = query)
>> > >
>> > > The problem is that the error message "looks" highly confusing to the
>> > > untrained eye:
>> > >
>> > > Error in (function (classes, fdef, mtable)  :   unable to find an
>> > > inherited method for function ‘dbGetQuery’ for signature ‘"missing",
>> > > "missing"’
>> > >
>> > > In retrospect, of course, this makes sense -- the mis-named arguments
>> > > are getting picked up by '...', leaving the required arguments
>> > > missing.
>> > >
>> > > But I was left wondering how we could help users right their own ship 
>> > > here.
>> > >
>> > > Would it help to mention the argument names? To include some code
>> > > checking for weird combinations of missing arguments? Any other
>> > > suggestions?
>> > >
>> > > Mike C
>> > >
>> > > [1] 
>> > > https://github.com/r-dbi/DBI/blob/97934c885749dd87a6beb10e8ccb6a5ebea3675e/R/dbGetQuery.R#L62-L64
>> > >
>> > > __
>> > > R-devel@r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>> >
>> >
>> > --
>> > http://hadley.nz
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Improving user-friendliness of S4 dispatch failure when mis-naming arguments?

2023-08-11 Thread Michael Chirico via R-devel
> I'm not entirely sure about the extra call. = FALSE

My thinking is the signature of the internal .InheritForDispatch() is
not likely to help anyone,
in fact having the opposite effect for beginners unsure how to use that info.

> Now I'd like to find an example that only uses packages with priority base | 
> Recommended

Sure, here are a few.

library(Matrix)
# searching for Matrix-owned generics
matrix_generics <- getGenerics(where = asNamespace("Matrix"))
matrix_generics@.Data[matrix_generics@package == "Matrix"]

# simple signature, one argument 'x'
symmpart()
# Error: unable to find an inherited method for function ‘symmpart’
for signature ‘x="missing"’

# more complicated signature, especially including ...
Cholesky(a = 1)
# Error: unable to find an inherited method for function ‘Cholesky’
for signature ‘A="missing"’
Cholesky(a = 1, perm = TRUE)
# Error: unable to find an inherited method for function ‘Cholesky’
for signature ‘A="missing"’
Cholesky(a = 1, perm = TRUE, IMult = 2)
# Error: unable to find an inherited method for function ‘Cholesky’
for signature ‘A="missing"’

---

'base' is a bit harder since stats4 just provides classes over stats
functions, so the missigness error comes from non-S4 code.

library(stats4)
coef()
# Error in coef.default() : argument "object" is missing, with no default

Defining our own generic:

setGeneric("BaseGeneric", \(a, ...) standardGeneric("BaseGeneric"))
BaseGeneric()
# Error: unable to find an inherited method for function ‘BaseGeneric’
for signature ‘a="missing"’

# getting multiple classes to show up requires setting the signature:
setMethod("BaseGeneric", signature(x = "double", y = "double"), \(x,
y, ...) x + y)
BaseGeneric(X = 1, Y = 2)
# Error: unable to find an inherited method for function ‘BaseGeneric’
for signature ‘x="missing", y="missing"’


On Fri, Aug 11, 2023 at 2:26 AM Martin Maechler
 wrote:
>
> >>>>> Michael Chirico via R-devel
> >>>>> on Thu, 10 Aug 2023 23:56:45 -0700 writes:
>
> > Here's a trivial patch that offers some improvement:
>
> > Index: src/library/methods/R/methodsTable.R
> > ===
> > --- src/library/methods/R/methodsTable.R (revision 84931)
> > +++ src/library/methods/R/methodsTable.R (working copy)
> > @@ -752,11 +752,12 @@
> > if(length(methods) == 1L)
> > return(methods[[1L]]) # the method
> > else if(length(methods) == 0L) {
> > -cnames <- paste0("\"", vapply(classes, as.character, ""), "\"",
> > +cnames <- paste0(head(fdef@signature, length(classes)), "=\"",
> > vapply(classes, as.character, ""), "\"",
> > collapse = ", ")
> > stop(gettextf("unable to find an inherited method for function %s
> > for signature %s",
> > sQuote(fdef@generic),
> > sQuote(cnames)),
> > + call. = FALSE,
> > domain = NA)
> > }
> > else
>
> > Here's the upshot for the example on DBI:
>
> > dbGetQuery(connection = conn, query = query)
> > Error: unable to find an inherited method for function ‘dbGetQuery’
> > for signature ‘conn="missing", statement="missing"’
>
> > I don't have any confidence about edge cases / robustness of this
> > patch for generic S4 use cases (make check-all seems fine),
>
> Good you checked, but you are right that that's not all enough to be sure:
>
> Checking error *messages* is not something we do often {not
> the least because you'd need to consider message translations
> and hence ensure you only check in case of English ...}.
>
> > but I don't suppose a full patch would be dramatically different from 
> the
> > above.
>
> I agree: The patch looks to make sense to me, too,
> while I'm not entirely sure about the extra   call. = FALSE
>  (which I of course understand you'd prefer for the above example)
>
> Now I'd like to find an example that only uses packages with priority
>  base | Recommended
>
> Martin
>
> --
> Martin Maechler
> ETH Zurich  and  R Core team
>
>
> > Mike C
>
> > On Thu, Aug 10, 2023 at 12:39 PM Gabriel Becker  
> wrote:
> >>
> >> I just want to add my 2 cents that I think it would be very useful and 
> beneficial to improve S4 to surface that information as well.
> >>
> >> Mo

[Rd] Unclear provenance of message from S4

2023-08-14 Thread Michael Chirico via R-devel
MRE to produce the message is the following:

setClass("Foo")
setOldClass("Bar")
setAs("Bar", "Foo", \(x) x)
# NOTE: arguments in definition for coerce changed from (x) to (from)

In an interactive setting, that may be fine, but I first encountered
this message in the install log of a package for which I am not the
author (nor do I have any context on the package, really).

As such it took me much longer than it otherwise would have to pick up
on the connection between 'coerce' and 'setAs' (as searching the
sources for 'coerce' yielded nothing). That the faulty argument is the
ubiquitous 'x' is also a major impediment.

It would be much better to surface the relationship to 'setAs' in this message.

I believe that logic has to live in setAs itself as the delegated
substituteFunctionArgs loses the proper context (unless we dare to
resort to sys.calls()).

Sharing here first per usual as I am not very familiar with S4, in
case this all is much clearer to someone with a sharper eye for S4.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] FR: valid_regex() to test string validity as a regular expression

2023-10-09 Thread Michael Chirico via R-devel
It will be useful to package authors trying to validate input which is
supposed to be a valid regular expression.

As near as I can tell, the only way we can do so now is to run any
regex function and check for the warning and/or condition to bubble
up:

valid_regex <- function(str) {
  stopifnot(is.character(str), length(str) == 1L)
  !inherits(tryCatch(grepl(str, ""), condition = identity), "condition")
}

That's pretty hefty/inscrutable for such a simple validation. I see a
variety of similar approaches in CRAN packages [1], all slightly
different. It would be good for R to expose a "canonical" way to run
this validation.

At root, the problem is that R does not expose the regex compilation
routines like 'tre_regcomp', so from the R side we have to resort to
hacky approaches.

Things get slightly complicated by encoding/useBytes modes
(tre_regwcomp, tre_regncomp, tre_regwncomp, tre_regcompb,
tre_regncompb; all in tre.h), but all are already present in other
regex routines, so this is doable.

Exposing a function to compile regular expressions is common in other
languages, e.g. Go [2], Python [3], JavaScript [4].

[1] 
https://github.com/search?q=lang%3AR+%2Fis%5Ba-zA-Z0-9._%5D*reg%5Ba-zA-Z0-9._%5D*ex.*%28%3C-%7C%3D%29%5Cs*function%2F+org%3Acran&type=code
[2] https://pkg.go.dev/regexp#Compile
[3] https://docs.python.org/3/library/re.html#re.compile
[4] 
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] FR: valid_regex() to test string validity as a regular expression

2023-10-10 Thread Michael Chirico via R-devel
> Grepping an empty string might work in many cases...

That's precisely why a base R offering is important, as a surer way of
validating in all cases. To be clear I am trying to directly access the
results of tre_regcomp().

> it is probably more portable to simply be prepared to propagate such
errors from the actual use on real inputs

That works best in self-contained calls -- foo(re) and we execute re inside
foo().

But the specific context where I found myself looking for a regex validator
is more complicated (https://github.com/r-lib/lintr/pull/2225). User
supplies a regular expression in a configuration file, only "later" is it
actually supplied to grepl().

Till now, we've done your suggestion -- just surface the regex error at run
time. But our goal is to make it friendlier and fail earlier at "compile
time" as the config is loaded, "long" before any regex is actually executed.

At a bare minimum this is a good place to return a classed warning (say
invalid_regex_warning) to allow finer control than tryCatch(condition=).

On Mon, Oct 9, 2023, 11:30 PM Tomas Kalibera 
wrote:

>
> On 10/10/23 01:57, Michael Chirico via R-devel wrote:
>
> It will be useful to package authors trying to validate input which is
> supposed to be a valid regular expression.
>
> As near as I can tell, the only way we can do so now is to run any
> regex function and check for the warning and/or condition to bubble
> up:
>
> valid_regex <- function(str) {
>   stopifnot(is.character(str), length(str) == 1L)
>   !inherits(tryCatch(grepl(str, ""), condition = identity), "condition")
> }
>
> That's pretty hefty/inscrutable for such a simple validation. I see a
> variety of similar approaches in CRAN packages [1], all slightly
> different. It would be good for R to expose a "canonical" way to run
> this validation.
>
> At root, the problem is that R does not expose the regex compilation
> routines like 'tre_regcomp', so from the R side we have to resort to
> hacky approaches.
>
> Hi Michael,
>
> I don't think you need compilation functions for that. If a regular
> expression is found invalid by a specific third party library R uses, the
> library should return and error to R and R should return an error to you,
> and you should probably propagate that to your users. Grepping an empty
> string might work in many cases as a test, but it is probably more portable
> to simply be prepared to propagate such errors from the actual use on real
> inputs. In theory, there could be some optimization for a particular case,
> the checking may not be the same - but that is the same say for compilation
> and checking.
>
> Things get slightly complicated by encoding/useBytes modes
> (tre_regwcomp, tre_regncomp, tre_regwncomp, tre_regcompb,
> tre_regncompb; all in tre.h), but all are already present in other
> regex routines, so this is doable.
>
> Re encodings, simply R strings should be valid in their encoding. This is
> not just for regular expressions but also for anything else. You shouldn't
> assume that R can handle invalid strings in any reasonable way. Definitely
> you shouldn't try adding invalid strings in tests - behavior with invalid
> strings is unspecified. To test whether a string is valid, there is
> validEnc() (or validUTF8()). But, again, it is probably safest to propagate
> errors from the regular expression R functions (in case the checks differ,
> particularly for non-UTF-8), also, duplicating the encoding checks can be a
> non-trivial overhead.
>
> If there was a strong need to have an automated way to somehow classify
> specifically errors from the regex libraries, perhaps R could attach some
> classes to them when the library tells.
>
> Tomas
>
> Exposing a function to compile regular expressions is common in other
> languages, e.g. Go [2], Python [3], JavaScript [4].
>
> [1] 
> https://github.com/search?q=lang%3AR+%2Fis%5Ba-zA-Z0-9._%5D*reg%5Ba-zA-Z0-9._%5D*ex.*%28%3C-%7C%3D%29%5Cs*function%2F+org%3Acran&type=code
> [2] https://pkg.go.dev/regexp#Compile
> [3] https://docs.python.org/3/library/re.html#re.compile
> [4] 
> https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
>
> __r-de...@r-project.org mailing 
> listhttps://stat.ethz.ch/mailman/listinfo/r-devel
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] FR: valid_regex() to test string validity as a regular expression

2023-10-11 Thread Michael Chirico via R-devel
Great to know this exists in package space!

Of course, using re2 validation for a regex to be executed with TRE
(via grep*) is just begging for trouble (e.g. [1] suggests re2 is
closer to PCRE, [2] says "mostly" PCRE compatible). And overhauling
everything to use re2 just for regex validation is hardly practical.

[1] https://laurikari.net/tre/google-releases-the-re2-library/
[2] https://hackerboss.com/is-your-regex-matcher-up-to-snuff/

On Wed, Oct 11, 2023 at 4:02 PM Toby Hocking  wrote:
>
> Hi Michael, it sounds like you don't want to use a CRAN package for
> this, but you may try re2, see below.
>
> > grepl("(invalid","subject",perl=TRUE)
> Error in grepl("(invalid", "subject", perl = TRUE) :
>   invalid regular expression '(invalid'
> In addition: Warning message:
> In grepl("(invalid", "subject", perl = TRUE) :
>   PCRE pattern compilation error
> 'missing closing parenthesis'
> at ''
>
> > grepl("(invalid","subject",perl=FALSE)
> Error in grepl("(invalid", "subject", perl = FALSE) :
>   invalid regular expression '(invalid', reason 'Missing ')''
> In addition: Warning message:
> In grepl("(invalid", "subject", perl = FALSE) :
>   TRE pattern compilation error 'Missing ')''
>
> > re2::re2_regexp("(invalid")
> Error: missing ): (invalid
>
> On Tue, Oct 10, 2023 at 7:57 AM Michael Chirico via R-devel
>  wrote:
> >
> > > Grepping an empty string might work in many cases...
> >
> > That's precisely why a base R offering is important, as a surer way of
> > validating in all cases. To be clear I am trying to directly access the
> > results of tre_regcomp().
> >
> > > it is probably more portable to simply be prepared to propagate such
> > errors from the actual use on real inputs
> >
> > That works best in self-contained calls -- foo(re) and we execute re inside
> > foo().
> >
> > But the specific context where I found myself looking for a regex validator
> > is more complicated (https://github.com/r-lib/lintr/pull/2225). User
> > supplies a regular expression in a configuration file, only "later" is it
> > actually supplied to grepl().
> >
> > Till now, we've done your suggestion -- just surface the regex error at run
> > time. But our goal is to make it friendlier and fail earlier at "compile
> > time" as the config is loaded, "long" before any regex is actually executed.
> >
> > At a bare minimum this is a good place to return a classed warning (say
> > invalid_regex_warning) to allow finer control than tryCatch(condition=).
> >
> > On Mon, Oct 9, 2023, 11:30 PM Tomas Kalibera 
> > wrote:
> >
> > >
> > > On 10/10/23 01:57, Michael Chirico via R-devel wrote:
> > >
> > > It will be useful to package authors trying to validate input which is
> > > supposed to be a valid regular expression.
> > >
> > > As near as I can tell, the only way we can do so now is to run any
> > > regex function and check for the warning and/or condition to bubble
> > > up:
> > >
> > > valid_regex <- function(str) {
> > >   stopifnot(is.character(str), length(str) == 1L)
> > >   !inherits(tryCatch(grepl(str, ""), condition = identity), "condition")
> > > }
> > >
> > > That's pretty hefty/inscrutable for such a simple validation. I see a
> > > variety of similar approaches in CRAN packages [1], all slightly
> > > different. It would be good for R to expose a "canonical" way to run
> > > this validation.
> > >
> > > At root, the problem is that R does not expose the regex compilation
> > > routines like 'tre_regcomp', so from the R side we have to resort to
> > > hacky approaches.
> > >
> > > Hi Michael,
> > >
> > > I don't think you need compilation functions for that. If a regular
> > > expression is found invalid by a specific third party library R uses, the
> > > library should return and error to R and R should return an error to you,
> > > and you should probably propagate that to your users. Grepping an empty
> > > string might work in many cases as a test, but it is probably more 
> > > portable
> > > to simply be prepared to propagate such errors from the actual use on real
> > > inputs. In theory, there could be some optimization for a parti