[Rd] Request for comment: namespace resolution in terms(<formula>, specials=) [<pkg>::<name>, etc.]
peter dalgaard
pdalgd sending from gmail.com
Tue Apr 15 10:17:57 CEST 2025
I don't seem to have the original post (not in spamfilter either). But generically, I think namespacing specials in formulas is just a Bad Idea. They are syntactic constructs, specifically _not_ function calls, so people are stumbling over formally protecting them from a non-existing scoping issue, then having to undo that for the actual use.
It all came about by someone (I have forgotten the details) having a corporate coding standard mandating namespaces on all function calls and falling over things like strata() in the survival package. Then package author(s) chose to comply rather than explain...
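A minimal sketch of the point that specials are matched syntactically rather than called. The name not_a_function is made up for illustration and deliberately left undefined; terms() still records it, because the formula is never evaluated. The second half shows the name-versus-call distinction raised in the quoted message; it does not need mgcv installed, since quote() does not evaluate its argument.

```r
# 'not_a_function' is a hypothetical name: no such function is defined here.
# terms() only inspects the formula's syntax, so this still works.
tt <- terms(~ x1 + not_a_function(x2), specials = "not_a_function")
sp <- attr(tt, "specials")$not_a_function
sp  # 2: the second variable in the formula matches the special

# With backticks the whole string is a single symbol;
# without them it is a call to the `::` operator.
as_name <- quote(`mgcv::s`)
as_call <- quote(mgcv::s)

class(as_name)   # "name"
typeof(as_name)  # "symbol"
class(as_call)   # "call"
typeof(as_call)  # "language"
```

So a special written as pkg::name is a different kind of object from a bare name, and code matching one will not see the other.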
-pd
> On 14 Apr 2025, at 23.49, Ben Bolker <bbolker using gmail.com> wrote:
>
> I don't have any concerns about these changes, don't see any need to preserve the old behaviour.
>
> In lme4 and glmmTMB (and now broken out into a separate `reformulas` package), I do this the hard way, walking down the parse trees of formula objects and looking for specials, and not using the functionality here.
>
> Mikael showed how I could use the *new* functionality instead:
>
> https://github.com/bbolker/reformulas/issues/4
>
> but honestly if I were going to change things in `reformulas` it would be in the direction of streamlining and refactoring, not changing the basic approach.
>
> cheers
> Ben Bolker
>
>
> On 2025-04-14 5:43 p.m., Mikael Jagan wrote:
>> [CC: maintainers of R packages survival, mgcv, lme4, RItools]
>> Dear R-devel subscribers,
>> If you have never used stats:::terms.formula or its 'specials' argument,
>> then feel free to stop reading or otherwise review help("terms.formula")
>> and help("terms.object").
>> Folks may have noticed a recent change in R-devel:
>> $ svn log -v -r 88066
>> ------------------------------------------------------------------------
>> r88066 | maechler | 2025-03-28 17:04:27 -0400 (Fri, 28 Mar 2025) | 1 line
>> Changed paths:
>> M /trunk/doc/NEWS.Rd
>> M /trunk/src/library/stats/src/model.c
>> M /trunk/tests/reg-tests-1e.R
>> terms(<formula>, specials = "<non-syntactic>") now works
>> ------------------------------------------------------------------------
>> intended to resolve Bug 18568
>> https://bugs.r-project.org/show_bug.cgi?id=18568
>> which pointed out the following undesirable behaviour in R-release:
>> > attr(terms(~x1 + s (x2, f) + s (x3, g), specials = "s"), "specials")
>> $s
>> [1] 2 3
>> > attr(terms(~x1 + `|`(x2, f) + `|`(x3, g), specials = "|"), "specials")
>> $`|`
>> NULL
>> namely that non-syntactic names like "|" were not supported. Unfortunately,
>> the patch (r88066) broke one package on CRAN, RItools, which relied on the
>> following
>> > attr(terms(~x1 + mgcv::s (x2, f), specials = "mgcv::s"), "specials")
>> $`mgcv::s`
>> [1] 2
>> > attr(terms(~x1 + `mgcv::s`(x2, f), specials = "mgcv::s"), "specials")
>> $`mgcv::s`
>> NULL
>> whereas in R-devel we see
>> > attr(terms(~x1 + mgcv::s (x2, f), specials = "mgcv::s"), "specials")
>> $`mgcv::s`
>> NULL
>> > attr(terms(~x1 + `mgcv::s`(x2, f), specials = "mgcv::s"), "specials")
>> $`mgcv::s`
>> [1] 2
>> A strict interpretation of 'specials' as a list of *name*s of functions would
>> suggest that the old behaviour was "wrong" (and accidental, predating package
>> namespaces altogether) and that the new behaviour is "right". After all,
>> `mgcv::s` (with backticks) is a name (of type "symbol", class "name") whereas
>> mgcv::s (without backticks) is a call (of type "language", class "call").
>> Martin and I are requesting comments from the community, especially R-core
>> members and package authors who use 'specials', on the following:
>> 1. Should the previous (long standing but undocumented, likely rarely used)
>> behaviour be preserved going forward?
>> 2. If we pursue a more *robust* implementation of namespace resolution by
>> stats:::terms.formula, not relying on details of how non-syntactic names
>> are deparsed, then what should that look like?
>> (I say "likely rarely used" because stats:::terms.formula is called primarily by
>> package *authors* to parse formulas of package *users*. Only a subset of those
>> packages will set 'specials', only a subset of *those* packages will set
>> specials="<pkg>::<name>", and only one such package is known to be broken due
>> to r88066.)
>> Relevant to (2) is an earlier thread
>> https://stat.ethz.ch/pipermail/r-devel/2025-March/083906.html
>> in which I proposed that we make use of an optional 'package' attribute of
>> 'specials', so that
>> specials = structure(c("s", "s"), package = c("", "mgcv"))
>> would match calls s(...) and mgcv::s(...) separately. This attribute would be
>> preserved by the 'specials' component of the 'terms' object, e.g.,
>> > attr(terms(~x1 + s(x2, f) + mgcv::s(x3, g),
>> + specials = structure(c("s", "s"), package = c("", "mgcv"))),
>> + "specials")
>> $s
>> [1] 2
>> $s
>> [1] 3
>> attr(,"package")
>> [1] "" "mgcv"
>> A patch against R-devel (at r88141) implementing this proposal is attached.
>> Mikael
>
> --
> Dr. Benjamin Bolker
> Professor, Mathematics & Statistics and Biology, McMaster University
> Director, School of Computational Science and Engineering
> > E-mail is sent at my convenience; I don't expect replies outside of working hours.
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com