[Rd] Set operation generics
Mikael Jagan
j@g@nmn2 @end|ng |rom gm@||@com
Thu Dec 18 16:50:10 CET 2025
> Date: Wed, 17 Dec 2025 11:50:21 -0800
> From: Josiah Parry<josiah.parry using gmail.com>
>
> I wanted to write to understand what limitations there may be with making
> set operations in base S3 generic functions. Are there any technical
> limitations as to why this wouldn't be possible?
>
The set ops {intersect, union, setdiff, setequal} and %in% and %notin% are all
generic-like by virtue of composing generic functions for vector-like classes.
If you have a vector-like class and you define (as needed) methods for '[',
'c', 'mtfrm', 'names<-', and 'unique', then the set ops work automatically and
correctly. The built-in classes 'Date', 'POSIXct', 'POSIXlt', 'difftime', and
'factor' provide a good model here.
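
As a quick illustration of the point above, the following sketch applies the
base set ops to 'Date' vectors without defining any set-op method. The variable
names and dates are made up for the example. Whether the results keep the 'Date'
class depends on the R version in use (recent versions should keep it), so the
class is printed rather than assumed.

```r
# Two overlapping 'Date' vectors (illustrative values).
d1 <- as.Date(c("2025-12-01", "2025-12-02", "2025-12-03"))
d2 <- as.Date(c("2025-12-03", "2025-12-04"))

u <- union(d1, d2)      # elements in either vector
i <- intersect(d1, d2)  # elements in both vectors
s <- setdiff(d1, d2)    # elements of d1 that are not in d2
m <- d1 %in% d2         # membership test, elementwise over d1

# Inspect the class of each result; no set-op method was defined for 'Date'.
print(class(u))
print(class(i))
print(class(s))
print(m)
```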

Making the set ops S3 generic would therefore mainly benefit non-vector-like
classes for which set ops happen to have a meaningful definition: 'nb' is a
good example, but are there many others?

A benefit of having a minimal set of generic functions in base (and composing
them to form a larger set of generic-like functions) is that it limits growth
of the base namespace. Every new generic function base::generic requires a
corresponding default method base::generic.default.

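
For concreteness, here is a minimal sketch of the generic-plus-default pattern
as it might look in user code that masks base::union. The class 'idset' and the
constructor new_idset() are hypothetical and exist only for this example; this
is not the code of any particular package.

```r
# An S3 generic that masks base::union in the environment where it is defined.
union <- function(x, y, ...) UseMethod("union")

# The default method delegates to base, so ordinary vectors behave as before.
union.default <- function(x, y, ...) base::union(x, y)

# Hypothetical non-vector-like class: a list wrapping an integer 'ids' field.
new_idset <- function(ids) structure(list(ids = as.integer(ids)), class = "idset")

# Method for the hypothetical class: take the union of the wrapped ids.
union.idset <- function(x, y, ...) new_idset(base::union(x$ids, y$ids))

u <- union(new_idset(1:3), new_idset(3:5))  # dispatches to union.idset
v <- union(c("a", "b"), c("b", "c"))        # falls through to union.default
```

Each generic introduced this way brings two names with it (the generic and its
default method), which is the namespace growth described above.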
> In writing a reply in R-Sig-Geo (1) today, I was reminded that `spdep`'s
> set operations are not exported S3 methods—e.g. must use
> spdep::union.nb()—because there is no generic declared in `base`.
>
> I think the R ecosystem would benefit greatly from generics declared in
> base for these methods. For example, the `generics` (2) package was
> published in 2018 including S3 generics for set operations masking base.
> `generics` has 189 reverse imports, I suspect quite a few of them are for
> set operations.
>
> Generics GitHub usage (duplicates ofc from forks)
>
> - 353 results for importFrom(generics, union) (3)
> - 361 results for importFrom(generics, intersect) (4)
> - 355 results for importFrom(generics,setdiff) (5)
>
> There are also a number of manual implementations of an S3 generic for set
> ops that mask base. See the following search GitHub results
>
> - 249 results for UseMethod("union") (6)
> - 208 results for UseMethod("intersect") (7)
> - 199 results for UseMethod("setdiff") (8)
>
My guess is that in most of these examples masking the base set ops would not
be necessary if some vector-like class were implemented more rigorously, i.e.,
with methods for '[', 'c', etc.
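
A rough sketch of what that could look like, assuming a hypothetical class
'celsius' that wraps a double vector. The class name, the constructor
new_celsius(), and the values are invented for the example. The 'mtfrm' generic
is only available in sufficiently recent R versions (4.3.0 or later, as far as
I know), and whether the set ops return 'celsius' objects also depends on the R
version, so the class is printed rather than assumed.

```r
# Hypothetical vector-like class: doubles tagged with class "celsius".
new_celsius <- function(x = double()) structure(as.double(x), class = "celsius")

# Subsetting keeps the class.
`[.celsius` <- function(x, i) new_celsius(unclass(x)[i])

# Concatenation keeps the class.
c.celsius <- function(...) new_celsius(unlist(lapply(list(...), unclass)))

# unique() keeps the class (the default method would drop it).
unique.celsius <- function(x, incomparables = FALSE, ...)
    new_celsius(unique(unclass(x), incomparables = incomparables, ...))

# Representation used for matching; only dispatched where 'mtfrm' is generic.
mtfrm.celsius <- function(x) unclass(x)

a <- new_celsius(c(18.5, 20, 21.5))
b <- new_celsius(c(20, 25))

u <- union(a, b)      # base set ops, no masking and no set-op methods
i <- intersect(a, b)
s <- setdiff(a, b)
m <- a %in% b

print(class(u))       # inspect whether the class survived
print(unclass(i))
print(unclass(s))
print(m)
```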
Mikael
>
> references:
> 1. https://stat.ethz.ch/pipermail/r-sig-geo/2025-December/029582.html
> 2. https://cran.r-project.org/src/contrib/Archive/generics
> 3. https://github.com/search?q=importFrom%28generics%2Cunion%29+&type=code
> 4. https://github.com/search?q=importFrom%28generics%2Cintersect%29+&type=code
> 5. https://github.com/search?q=importFrom%28generics%2Csetdiff%29+&type=code
> 6. https://github.com/search?q=UseMethod%28%22union%22%29+language%3AR&type=code
> 7. https://github.com/search?q=UseMethod%28%22intersect%22%29+language%3AR&type=code
> 8. https://github.com/search?q=UseMethod%28%22setdiff%22%29+language%3AR&type=code
>