[Rd] Set operation generics
Mikael Jagan
jaganmn2 using gmail.com
Fri Dec 19 21:13:07 CET 2025
On 2025-12-19 12:41 pm, Josiah Parry wrote:
> Thanks, Mikael!
>
> I don't think that adding methods for S3 classes for [ and c() is
> sufficient, unfortunately. I think the current behavior is a beautiful
> default implementation, of course.
>
> For some classes subsetting or combining may not be an operation that makes
> sense—keeping with the nb class example, what does it mean to combine a
> spatial weights matrix between two separate study regions? Or what happens
> when you subset a spatial weights matrix and it contains locations as
> neighbors that no longer exist in it?
>
> Additionally, this makes the assumption that the base implementations are
> the appropriate way to perform these operations for all S3 classes—they may
> not be! I also wonder about any assumptions one might make about the union
> or intersection of attributes of the provided S3 classes as well.
>
What you are describing is exactly the distinction between vector-like and
non-vector-like classes that I was trying to make, and that help("union")
tries to make where it says

    The set operations are intended for "same-kind" "vector-like"
    objects containing sequences of items. ....

We agree that 'nb' is one example of a class that, despite having
well-defined set operations, does not satisfy the requirement of
"vector-likeness".
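
As a rough illustration of the vector-like case (a sketch only:
compose_union is a hypothetical helper, not the actual definition of
base::union), composing c() and unique() is enough to get a
class-preserving union for 'Date':

```r
# Hypothetical helper, NOT base R's implementation of union().
# It only illustrates building a set operation out of c() and unique().
compose_union <- function(x, y) unique(c(x, y))

d1 <- as.Date(c("2025-12-17", "2025-12-18"))
d2 <- as.Date(c("2025-12-18", "2025-12-19"))

u <- compose_union(d1, d2)
class(u)   # "Date": c() and unique() both keep the class
length(u)  # 3: the shared date appears only once
```

No union method for 'Date' is involved here; the class survives because
c() and unique() already handle 'Date'.
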
My main point was that your GitHub searches don't provide convincing (to me)
evidence of the proposal's broader benefit, because they don't measure the
number of *classes* out there that are like 'nb' in the above sense.
Mikael
>
>
>
> On Thu, Dec 18, 2025 at 7:50 AM Mikael Jagan <jaganmn2 using gmail.com> wrote:
>
>>> Date: Wed, 17 Dec 2025 11:50:21 -0800
>>> From: Josiah Parry <josiah.parry using gmail.com>
>>>
>>> I wanted to write to understand what limitations there may be with making
>>> set operations in base S3 generic functions. Are there any technical
>>> limitations as to why this wouldn't be possible?
>>>
>>
>> The set ops {intersect, union, setdiff, setequal} and %in% and %notin% are
>> all generic-like by virtue of composing generic functions for vector-like
>> classes. If you have a vector-like class and you define (as needed) methods
>> for '[', 'c', 'mtfrm', 'names<-', and 'unique', then the set ops work
>> automatically and correctly. The built-in classes 'Date', 'POSIXct',
>> 'POSIXlt', 'difftime', and 'factor' provide a good model here.
>>
>> S3 generic set ops would only really support those non-vector-like classes
>> for which set ops happen to have a meaningful definition: 'nb' is a good
>> example, but are there many others?
>>
>> A benefit of having a minimal set of generic functions in base (and
>> composing them to form a larger set of generic-like functions) is that it
>> limits growth of the base namespace. Every new generic function
>> base::generic requires a corresponding default method
>> base::generic.default.
>>
>>> In writing a reply in R-Sig-Geo (1) today, I was reminded that `spdep`'s
>>> set operations are not exported S3 methods—e.g. must use
>>> spdep::union.nb()—because there is no generic declared in `base`.
>>>
>>> I think the R ecosystem would benefit greatly from generics declared in
>>> base for these methods. For example, the `generics` (2) package was
>>> published in 2018 including S3 generics for set operations masking base.
>>> `generics` has 189 reverse imports; I suspect quite a few of them are for
>>> set operations.
>>>
>>> Generics GitHub usage (duplicates ofc from forks)
>>>
>>> - 353 results for importFrom(generics, union) (3)
>>> - 361 results for importFrom(generics, intersect) (4)
>>> - 355 results for importFrom(generics,setdiff) (5)
>>>
>>> There are also a number of manual implementations of an S3 generic for set
>>> ops that mask base. See the following GitHub search results:
>>>
>>> - 249 results for UseMethod("union") (6)
>>> - 208 results for UseMethod("intersect") (7)
>>> - 199 results for UseMethod("setdiff") (8)
>>>
>>
>> My guess is that in most of these examples masking the base set ops would
>> not be necessary if some vector-like class were implemented more
>> rigorously, i.e., with methods for '[', 'c', etc.
>>
>> Mikael
>>
>>>
>>> References:
>>> 1. https://stat.ethz.ch/pipermail/r-sig-geo/2025-December/029582.html
>>> 2. https://cran.r-project.org/src/contrib/Archive/generics
>>> 3. https://github.com/search?q=importFrom%28generics%2Cunion%29+&type=code
>>> 4. https://github.com/search?q=importFrom%28generics%2Cintersect%29+&type=code
>>> 5. https://github.com/search?q=importFrom%28generics%2Csetdiff%29+&type=code
>>> 6. https://github.com/search?q=UseMethod%28%22union%22%29+language%3AR&type=code
>>> 7. https://github.com/search?q=UseMethod%28%22intersect%22%29+language%3AR&type=code
>>> 8. https://github.com/search?q=UseMethod%28%22setdiff%22%29+language%3AR&type=code
>>>