Hi Mike,

On Fri, Jun 6, 2025 at 1:59 PM Michael Chirico
<michaelchiri...@gmail.com> wrote:
>
> There is a big difference in how to think of '...' for non-generic
> functions like data.frame() vs. S3 generics.
>
> In the former, it means "any number of inputs" [e.g. columns]; in the
> latter, it means "any number of inputs [think c()], as well as any
> arguments that might be interpreted by class implementations".
>
> Understanding the difference for a given generic can require carefully
> reading lots of documentation. print(<generic>), which is useful for
> so many other contexts, can be a dead end.
>
> One idea is to extend the print() method to suggest to the reader
> which other arguments are available (among registered generics). Often
> ?<generic> will include the most common implementation, but not always
> so.
>
> For rbind (in a --vanilla session), we currently have one method,
> rbind.data.frame, that offers three arguments not present in the
> generic: make.row.names, stringsAsFactors, and factor.exclude. The
> proposal would be to mention this in the print(rbind) output somehow,
> e.g.
>
> > print(rbind)
> function (..., deparse.level = 1)
> .Internal(rbind(deparse.level, ...))
> <bytecode: 0x73d4fd824e20>
> <environment: namespace:base>
>
> +Other arguments implemented by methods
> +  factor.exclude: rbind.data.frame
> +  make.row.names: rbind.data.frame
> +  stringsAsFactors: rbind.data.frame
>
> I suggest grouping by argument, not generic, although something like
> this could be OK too:
>
> +Signatures of other methods
> +  rbind.data.frame(..., deparse.level = 1, make.row.names = TRUE,
> stringsAsFactors = FALSE,
> +      factor.exclude = TRUE)
>
> Where it gets more interesting is when there are many methods, e.g.
> for as.data.frame (again, in a --vanilla session):
>
> > print(as.data.frame)
> function (x, row.names = NULL, optional = FALSE, ...)
> {
>     if (is.null(x))
>         return(as.data.frame(list()))
>     UseMethod("as.data.frame")
> }
> <bytecode: 0x73d4fc1e70d0>
> <environment: namespace:base>
>
> +Other arguments implemented by methods
> +  base: as.data.frame.table
> +  check.names: as.data.frame.list
> +  col.names: as.data.frame.list
> +  cut.names: as.data.frame.list
> +  fix.empty.names: as.data.frame.list
> +  make.names: as.data.frame.matrix, as.data.frame.model.matrix
> +  new.names: as.data.frame.list
> +  nm: as.data.frame.bibentry, as.data.frame.complex, as.data.frame.Date,
> +    as.data.frame.difftime, as.data.frame.factor, as.data.frame.integer,
> +    as.data.frame.logical, as.data.frame.noquote, as.data.frame.numeric,
> +    as.data.frame.numeric_version, as.data.frame.ordered,
> +    as.data.frame.person, as.data.frame.POSIXct, as.data.frame.raw
> +  responseName: as.data.frame.table
> +  sep: as.data.frame.table
> +  stringsAsFactors: as.data.frame.character, as.data.frame.list,
> +    as.data.frame.matrix, as.data.frame.table
>
> Or
>
> +Signatures of other methods
> +  as.data.frame.aovproj(x, ...)
> +  as.data.frame.array(x, row.names = NULL, optional = FALSE, ...)
> +  as.data.frame.AsIs(x, row.names = NULL, optional = FALSE, ...)
> +  as.data.frame.bibentry(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.character(x, ..., stringsAsFactors = FALSE)
> +  as.data.frame.citation(x, row.names = NULL, optional = FALSE, ...)
> +  as.data.frame.complex(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.data.frame(x, row.names = NULL, ...)
> +  as.data.frame.Date(x, row.names = NULL, optional = FALSE, ..., nm =
> deparse1(substitute(x)))
> +  as.data.frame.default(x, ...)
> +  as.data.frame.difftime(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.factor(x, row.names = NULL, optional = FALSE, ..., nm
> = deparse1(substitute(x)))
> +  as.data.frame.ftable(x, row.names = NULL, optional = FALSE, ...)
> +  as.data.frame.integer(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.list(x, row.names = NULL, optional = FALSE, ...,
> cut.names = FALSE,
> +      col.names = names(x), fix.empty.names = TRUE, new.names =
> !missing(col.names),
> +      check.names = !optional, stringsAsFactors = FALSE)
> +  as.data.frame.logical(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.logLik(x, ...)
> +  as.data.frame.matrix(x, row.names = NULL, optional = FALSE,
> make.names = TRUE,
> +      ..., stringsAsFactors = FALSE)
> +  as.data.frame.model.matrix(x, row.names = NULL, optional = FALSE,
> make.names = TRUE,
> +      ...)
> +  as.data.frame.noquote(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.numeric(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.numeric_version(x, row.names = NULL, optional =
> FALSE, ..., nm = deparse1(substitute(x)))
> +  as.data.frame.ordered(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.person(x, row.names = NULL, optional = FALSE, ..., nm
> = deparse1(substitute(x)))
> +  as.data.frame.POSIXct(x, row.names = NULL, optional = FALSE, ...,
> nm = deparse1(substitute(x)))
> +  as.data.frame.POSIXlt(x, row.names = NULL, optional = FALSE, ...)
> +  as.data.frame.raw(x, row.names = NULL, optional = FALSE, ..., nm =
> deparse1(substitute(x)))
> +  as.data.frame.table(x, row.names = NULL, ..., responseName =
> "Freq", stringsAsFactors = TRUE,
> +      sep = "", base = list(LETTERS))
> +  as.data.frame.ts(x, ...)
>
> Obviously that's a bit more cluttered, but as.data.frame() should be a
> pretty unusual case. It also highlights better the differences in the
> two approaches: the former economizes on space and focuses on what
> sorts of arguments are available; the latter shows the defaults, does
> not hide the arguments shared with the generic, and will always
> produce as many lines as there are methods.
>
> There are other edge cases to think through (multiple registrations,
> interactions with S4, primitives, ...), but I want to first check with
> the list if this looks workable & valuable enough to pursue.
>
I like and appreciate the intent behind your suggestion, though I
don't like all the extra output from printing the generic. I want to
look at the function body when I print it. And as you show, it can
output a lot of information you're probably not interested in.

What about adding the number of methods to printed output for
generics, and a suggestion to use `methods(some_generic)` to get a
list of them? Then you can use help(some_method) or args(some_method)
to get more information about the specific method(s) you're interested
in.

Best,
Josh

> Mike C
>
> ----
>
> Code that helped with the above:
>
> f = as.data.frame
> # NB: methods() and getAnywhere() require {utils}
> m = methods(f)
> generic_args = names(formals(f))
> f_methods = lapply(m, \(fn) getAnywhere(fn)$objs[[1L]])
> names(f_methods) = m
> new_args = sapply(f_methods, \(g) setdiff(names(formals(g)), generic_args))
> with( # group by argument name
>   data.frame(method = rep(names(new_args), lengths(new_args)), arg =
> unlist(new_args), row.names=NULL),
>   {tbl = tapply(method, arg, toString); writeLines(paste0(names(tbl),
> ": ", tbl))}
> )
> signatures=sapply(f_methods, \(g) paste(head(format(args(g)), -1),
> collapse="\n"))
> writeLines(paste0(names(signatures), gsub("^\\s*function\\s*", "", 
> signatures)))
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to