[Rd] An iteration protocol
Lionel Henry
lionel sending from posit.co
Tue Aug 12 11:20:39 CEST 2025
Clever! If going for non-local returns, it's probably best for ergonomics to
pass in a closure (see e.g. `callCC()`), if only to avoid accidental jumps
while debugging.
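To make the closure-passing idea concrete, here is a minimal sketch using base R's `callCC()`. The names `make_counter()`, `nextValue` and `exit` are made up for illustration and come from neither `iterors` nor the proposed patch. The iterator receives an exit closure and calls it when exhausted, instead of relying on a lazily evaluated `break`.

``` r
# Hypothetical iterator: calls the supplied `done` closure when exhausted.
make_counter <- function(n) {
  i <- 0
  function(done) {
    if (i >= n) done(NULL)  # non-local exit; does not return to this frame
    i <<- i + 1
    i
  }
}

nextValue <- make_counter(3)
total <- 0

# callCC() hands its argument an exit function; calling that function
# unwinds the stack back to the callCC() call.
result <- callCC(function(exit) {
  repeat total <<- total + nextValue(exit)
})

total   # 6
result  # NULL, the value passed to exit()
```

The jump is only possible while the `callCC()` frame is still on the stack, which is part of what makes it harder to trigger by accident.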
But... do we need more lazy evaluation tricks in the language or fewer? It's
probably more idiomatic to express non-local returns with condition signals
like `stopIteration()`.
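As a rough sketch of the condition-based variant, assuming a hypothetical `stopIteration()` helper and `make_counter()` iterator (neither exists in base R; both are defined here for illustration only):

``` r
# Hypothetical helper: signals a classed error condition to mark exhaustion.
stopIteration <- function() {
  cond <- structure(
    class = c("stopIteration", "error", "condition"),
    list(message = "iterator exhausted", call = NULL)
  )
  stop(cond)
}

# Hypothetical iterator: takes no sentinel, signals when exhausted.
make_counter <- function(n) {
  i <- 0
  function() {
    if (i >= n) stopIteration()
    i <<- i + 1
    i
  }
}

nextValue <- make_counter(3)
total <- 0

# The loop ends when the handler for the "stopIteration" class is invoked.
tryCatch(
  repeat total <- total + nextValue(),
  stopIteration = function(cnd) invisible(NULL)
)

total  # 6
```

No argument needs to be evaluated lazily here; the cost is one `tryCatch()` around the loop.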
There's something to be said for explicit and simple control flow though, via
handling of returned values.
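For comparison, a minimal sketch of the returned-value style. The `iterate()` driver and `make_counter()` are hypothetical names, and using a fresh environment as the sentinel is an assumption of this sketch, not what the patch or coro does:

``` r
# Hypothetical iterator: returns the sentinel it was given when exhausted.
make_counter <- function(n) {
  i <- 0
  function(done) {
    if (i >= n) return(done)
    i <<- i + 1
    i
  }
}

# Hypothetical loop driver: creates a temporary sentinel for each loop.
iterate <- function(iterator, f) {
  done <- new.env()  # a fresh environment is identical() only to itself
  repeat {
    item <- iterator(done)
    if (identical(item, done)) break
    f(item)
  }
  invisible(NULL)
}

seen <- c()
iterate(make_counter(3), function(x) seen <<- c(seen, x))
seen  # 1 2 3
```

All control flow is visible in the driver: one call, one comparison, one `break`.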
> Note that it is trivial to create a unique sentinel value -- any newly
> created closure (i.e. function() NULL) will do, as it will only
> compare identical() with itself.
Until you try that in the global env, right? Then the risk of collision slightly
increases. Unless you make your closure more unique via `body()`, but then you
might as well use a conventional sentinel.
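A small sketch of the collision, relying on `identical()` comparing closures by formals, body and environment (with source references ignored by default). The helper `make_in()` is hypothetical and only serves to create closures in a chosen environment:

``` r
# Hypothetical helper: create a `function() NULL` closure in a given env.
make_in <- function(env) eval(quote(function() NULL), env)

# Two separately created closures that share the global environment.
a <- make_in(globalenv())
b <- make_in(globalenv())
same_env <- identical(a, b)
same_env   # TRUE: same formals, body and environment

# Closures created in fresh environments do not collide.
c1 <- local(function() NULL)
c2 <- local(function() NULL)
fresh_env <- identical(c1, c2)
fresh_env  # FALSE: the enclosing environments differ
```

So the closure sentinel is only unique when it is created inside a function call or another fresh environment.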
Best,
Lionel
On Tue, Aug 12, 2025 at 1:45 AM Peter Meilstrup
<peter.meilstrup using gmail.com> wrote:
>
> Passing the sentinel value as an argument to the iteration method is
> the approach taken in my package `iterors` on CRAN. If the sentinel
> value argument is evaluated lazily, this lets you pass calls to things
> like `stop`, `break` or `return`, which will be evaluated to signal the end
> of iteration. This makes for some nice compact and performant
> iteration idioms:
>
> iter <- as.iteror(obj)
> total <- 0
> repeat {total <- total + nextOr(iter, break)}
>
> Note that an iteror is just a closure with one optional argument and a
> class attribute, so you can skip the S3 `nextOr` method and call it
> directly:
>
> nextElem <- as.iteror(obj)
> repeat {total <- total + nextElem(break)}
>
> For backward compatibility with the iterators package, the default
> sentinel value for iterors is `stop("StopIteration")`.
>
> Note that it is trivial to create a unique sentinel value -- any newly
> created closure (i.e. function() NULL) will do, as it will only
> compare identical() with itself.
>
> sigil <- \() NULL
> nextElem <- as.iteror(obj)  # `next` is a reserved word in R
> while (!identical(item <- nextElem(sigil), sigil)) {
>   doStuff(item)
> }
>
> Peter Meilstrup
>
> On Mon, Aug 11, 2025 at 5:56 PM Lionel Henry via R-devel
> <r-devel using r-project.org> wrote:
> >
> > Hello,
> >
> > A couple of comments:
> >
> > - Regarding the closure + sentinel approach, also implemented in coro
> > (https://github.com/r-lib/coro/blob/main/R/iterator.R), it's more robust
> > for the sentinel to always be a temporary value. If you store the sentinel
> > in a list or a namespace, it might inadvertently close iterators when
> > iterating over that collection. That's why the coro sentinel is created
> > with `coro::exhausted()` rather than exported from the namespace as a
> > constant object. The sentinel can be equivalently created with
> > `as.symbol(".__exhausted__.")`, the main thing to ensure robustness is to
> > avoid storing it and always create it from scratch.
> >
> > The approach of passing the sentinel by argument (which I see in the example
> > in your mail but not in the linked documentation of approach 3) also works
> > if the iterator loop passes a unique sentinel. Having a default of `NULL`
> > makes it likely to get unexpected exhaustion of iterators when a sentinel
> > is not passed in though.
> >
> > - It's very useful to _close_ iterators for resource cleanup. It's the
> > responsibility of an iterator loop (e.g. `for` but could be other custom tools
> > invoking the iterator) to close them. See https://github.com/r-lib/coro/pull/58
> > for an interesting application of iterator closing, allowing robust support of
> > `on.exit()` expressions in coro generators.
> >
> > To implement iterator closing with the closure approach, an iterator may
> > optionally take a `close` argument. A `TRUE` value is passed on exit,
> > instructing the iterator to clean up resources.
> >
> > Best,
> > Lionel
> >
> > On Mon, Aug 11, 2025 at 3:24 PM Tomasz Kalinowski <kalinowskit using gmail.com> wrote:
> > >
> > > Hi all,
> > >
> > > A while back, Hadley and I explored what an iteration protocol for R
> > > might look like. We worked through motivations, design choices, and edge
> > > cases, which we documented here:
> > > https://github.com/t-kalinowski/r-iterator-ideas
> > >
> > > At the end of this process, I put together a patch to R (with tests) and
> > > would like to invite feedback from R Core and the broader community:
> > > https://github.com/r-devel/r-svn/pull/130/files?diff=unified&w=1
> > >
> > > In summary, the overall design is a minimal patch. It introduces no
> > > breaking changes and essentially no new overhead. There are two parts.
> > >
> > > 1. Add a new `as.iterable()` S3 generic, with a default identity
> > > method. This provides a user-extensible mechanism for selectively
> > > changing the iteration behavior for some object types passed to
> > > `for`. `as.iterable()` methods are expected to return anything that
> > > `for` can handle directly, namely, vectors or pairlists, or (new) a
> > > closure.
> > >
> > > 2. `for` gains the ability to accept a closure for the iterable
> > > argument. A closure is called repeatedly for each loop iteration
> > > until the closure returns an `exhausted` sentinel value, which it
> > > received as an input argument.
> > >
> > > Here is a small example of using the iteration protocol to implement a
> > > sequence of random samples:
> > >
> > > ``` r
> > > SampleSequence <- function(n) {
> > >   i <- 0
> > >   function(done = NULL) {
> > >     if (i >= n) {
> > >       return(done)
> > >     }
> > >     i <<- i + 1
> > >     runif(1)
> > >   }
> > > }
> > >
> > > for(sample in SampleSequence(2)) {
> > >   print(sample)
> > > }
> > >
> > > # [1] 0.7677586
> > > # [1] 0.355592
> > > ```
> > >
> > > Best,
> > > Tomasz
> > >
> > > ______________________________________________
> > > R-devel using r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel