On 2025-12-19 12:41 pm, Josiah Parry wrote:
Thanks, Mikael!

I don't think that adding methods for S3 classes for [ and c() is
sufficient, unfortunately. I think the current behavior is a beautiful
default implementation, of course.

For some classes subsetting or combining may not be an operation that makes
sense—keeping with the nb class example, what does it mean to combine a
spatial weights matrix between two separate study regions? Or what happens
when you subset a spatial weights matrix and it contains locations as
neighbors that no longer exist in it?
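
To make the second question concrete, here is a minimal sketch in R. It
assumes a toy neighbour list (a plain list of integer index vectors, which is
hypothetical and much simpler than spdep's actual 'nb' objects) and shows that
ordinary list subsetting leaves indices pointing at locations that were dropped.

```r
# Toy neighbour list: element i holds the indices of the neighbours of
# location i. Hypothetical structure for illustration; not spdep's 'nb' class.
toy <- list(c(2L, 3L), c(1L, 3L), c(1L, 2L))

# Keep only locations 1 and 2 using ordinary list subsetting.
sub <- toy[1:2]

# Flag locations whose neighbour indices point past the end of the subset.
dangling <- vapply(sub, function(i) any(i > length(sub)), logical(1))
print(dangling)
```

Both remaining locations still refer to location 3, so a subsetting method for
such a class would have to drop or renumber neighbours rather than just select
elements.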

Additionally, this makes the assumption that the base implementations are
the appropriate way to perform these operations for all S3 classes—they may
not be! I also wonder about any assumptions one might make about the union
or intersection of attributes of the provided S3 classes as well.


What you are describing is exactly the distinction between vector-like and
non-vector-like classes that I was trying to make { and that help("union")
tries to make where it says

    The set operations are intended for "same-kind" "vector-like"
    objects containing sequences of items.  .... }

We agree that 'nb' is one example of a class that, despite having well defined
set operations, does not satisfy the requirement of "vector-likeness".

My main point was that your GitHub searches don't provide convincing (to me)
evidence of the proposal's broader benefit, because they don't measure the
number of *classes* out there that are like 'nb' in the above sense.

Mikael




On Thu, Dec 18, 2025 at 7:50 AM Mikael Jagan <[email protected]> wrote:

Date: Wed, 17 Dec 2025 11:50:21 -0800
From: Josiah Parry<[email protected]>

I wanted to write to understand what limitations there may be with making
set operations in base S3 generic functions. Are there any technical
limitations as to why this wouldn't be possible?


The set ops {intersect, union, setdiff, setequal} and %in% and %notin% are all
generic-like by virtue of composing generic functions for vector-like classes.
If you have a vector-like class and you define (as needed) methods for '[',
'c', 'mtfrm', 'names<-', and 'unique', then the set ops work automatically and
correctly.  The built-in classes 'Date', 'POSIXct', 'POSIXlt', 'difftime', and
'factor' provide a good model here.
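
As a rough illustration of the composition idea, the sketch below defines a
hypothetical vector-like class 'temp' with methods for '[', 'c', and 'unique',
and then builds a union out of those generics alone. The helper sketch_union()
is an assumption made for this example; it is not the source of base::union(),
whose exact implementation may differ between R versions.

```r
# Hypothetical vector-like class: a numeric vector tagged with class "temp".
temp <- function(x) structure(as.numeric(x), class = "temp")

# Subsetting keeps the class.
`[.temp` <- function(x, i) structure(unclass(x)[i], class = "temp")

# Combining strips the class from each argument, concatenates, and re-tags.
c.temp <- function(...) {
  structure(unlist(lapply(list(...), unclass)), class = "temp")
}

# Uniqueness is defined through duplicated() on the underlying data and '['.
unique.temp <- function(x, incomparables = FALSE, ...) {
  x[!duplicated(unclass(x), incomparables = incomparables, ...)]
}

# A union composed only of the generics defined above (illustrative helper).
sketch_union <- function(x, y) unique(c(x, y))

u <- sketch_union(temp(c(1, 2, 3)), temp(c(3, 4)))
print(class(u))
print(unclass(u))
```

Because sketch_union() only calls 'c' and 'unique', a class that gets those
methods right needs no union method of its own in this sketch.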

S3 generic set ops would only really support those non-vector-like classes for
which set ops happen to have a meaningful definition: 'nb' is a good example,
but are there many others?

A benefit of having a minimal set of generic functions in base (and composing
them to form a larger set of generic-like functions) is that it limits growth
of the base namespace.  Every new generic function base::generic requires a
corresponding default method base::generic.default.

While writing a reply on R-Sig-Geo (1) today, I was reminded that `spdep`'s
set operations are not exported as S3 methods (e.g., one must call
spdep::union.nb() directly) because there is no generic declared in `base`.

I think the R ecosystem would benefit greatly from generics declared in
base for these methods. For example, the `generics` (2) package was
published in 2018 including S3 generics for set operations masking base.
`generics` has 189 reverse imports, and I suspect quite a few of them are for
set operations.

GitHub usage of `generics` (with duplicates, of course, from forks):

- 353 results for importFrom(generics, union) (3)
- 361 results for importFrom(generics, intersect) (4)
- 355 results for importFrom(generics,setdiff) (5)

There are also a number of manual implementations of an S3 generic for set
ops that mask base. See the following GitHub search results:

- 249 results for UseMethod("union") (6)
- 208 results for UseMethod("intersect") (7)
- 199 results for UseMethod("setdiff") (8)
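
For readers unfamiliar with the pattern these searches count, the sketch below
shows one way a package might mask base::union() with its own S3 generic. The
class 'toy_nb' and its union method are hypothetical and are not spdep's
implementation; the default method simply defers to base.

```r
# A local S3 generic; defining it masks base::union where it is visible.
union <- function(x, y, ...) UseMethod("union")

# Everything without a dedicated method falls back to the base behaviour.
union.default <- function(x, y, ...) base::union(x, y)

# Hypothetical class: a list with one integer vector of neighbour indices
# per location.
toy_nb <- function(x) structure(lapply(x, as.integer), class = "toy_nb")

# Location-wise union of neighbour sets; both inputs must describe the same
# number of locations.
union.toy_nb <- function(x, y, ...) {
  stopifnot(inherits(y, "toy_nb"), length(x) == length(y))
  out <- Map(function(a, b) sort(base::union(a, b)), unclass(x), unclass(y))
  toy_nb(out)
}

a <- toy_nb(list(2L, c(1L, 3L), 2L))
b <- toy_nb(list(3L, 3L, c(1L, 2L)))
ab <- union(a, b)        # dispatches to union.toy_nb
v  <- union(1:3, 2:5)    # dispatches to union.default
```

The cost of this approach is that every package doing it ships its own
generic, which is the duplication the proposal aims to remove.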


My guess is that in most of these examples masking the base set ops would not
be necessary if some vector-like class were implemented more rigorously, i.e.,
with methods for '[', 'c', etc.

Mikael


References:
1. https://stat.ethz.ch/pipermail/r-sig-geo/2025-December/029582.html
2. https://cran.r-project.org/src/contrib/Archive/generics
3. https://github.com/search?q=importFrom%28generics%2Cunion%29+&type=code
4. https://github.com/search?q=importFrom%28generics%2Cintersect%29+&type=code
5. https://github.com/search?q=importFrom%28generics%2Csetdiff%29+&type=code
6. https://github.com/search?q=UseMethod%28%22union%22%29+language%3AR&type=code
7. https://github.com/search?q=UseMethod%28%22intersect%22%29+language%3AR&type=code
8. https://github.com/search?q=UseMethod%28%22setdiff%22%29+language%3AR&type=code





______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
