[Rd] Why does poly work for unordered factors?

Roland Fuß ro|@nd@|u@@ @end|ng |rom thuenen@de
Mon Oct 27 06:57:19 CET 2025


Great. Thank you, Martin.

Please credit Stack Overflow users Christoph and SamR instead of me. I was
just brave enough to send an email to the list. Yes, spam is one of the
reasons we can't have nice things, but it's unfortunate that there isn't
a low-barrier way to report bugs. Sending your first email to this
list is scary.

Roland

On 25.10.2025 at 12:55, Martin Maechler wrote:
>>>>>> Deepayan Sarkar
>>>>>>      on Wed, 22 Oct 2025 16:43:49 +0530 writes:
>      > On Wed, 22 Oct 2025 at 14:41, Martin Maechler
>      > <maechler using stat.math.ethz.ch> wrote:
>      >>
>      >> >>>>> Roland Fuß via R-devel
>      >> >>>>>     on Wed, 22 Oct 2025 10:24:07 +0200 writes:
>      >>
>      >> > This doesn't seem intended.
>      >>
>      >> You are right.  The code change, reverting to previous behaviour
>      >> notably for "Date",
>      >> was prompted on this R-devel list,
>      >> https://stat.ethz.ch/pipermail/r-devel/2022-July/081850.html
>      >>
>      >> But that the change allows poly(<factor>, .) to work was overlooked (by
>      >> me and anyone else ..) and is a bug we will change.
>      >>
>      >> > See:
>      >>
>      >> > https://stackoverflow.com/questions/79795583/why-does-poly-work-for-unordered-factors-it-previously-did-not-work
>      >>
>      >> As was already raised in the above SO thread,
>      >> what should happen for *ordered* factors is less obvious.
>      >> A warning was proposed, but I thought that this was too harsh;
>      >> hence, we could use message(), or just keep allowing it.
>      >>
>      >> Opinions?
>
>
>      > Given that we use contr.poly by default for ordered factors, I think
>      > it's very natural to allow it (without even a message). In fact, it
>      > would be a nice way to illustrate what contr.poly does; e.g.,
>
>      >> y <- rnorm(100); g <- gl(5, 20, ordered = TRUE)
>      >> summary(lm(y ~ g)) |> coefficients()
>      >                 Estimate Std. Error     t value  Pr(>|t|)
>      > (Intercept)  0.138970785  0.1020089  1.36233970 0.1763120
>      > g.L         -0.182590696  0.2280989 -0.80048932 0.4254247
>      > g.Q         -0.206493256  0.2280989 -0.90527968 0.3676074
>      > g.C          0.003626904  0.2280989  0.01590058 0.9873471
>      > g^4         -0.074807753  0.2280989 -0.32796199 0.7436621
>      >> summary(lm(y ~ poly(g, 4))) |> coefficients()
>      >                Estimate Std. Error     t value  Pr(>|t|)
>      > (Intercept)  0.13897078  0.1020089  1.36233970 0.1763120
>      > poly(g, 4)1 -0.81657042  1.0200891 -0.80048932 0.4254247
>      > poly(g, 4)2 -0.92346592  1.0200891 -0.90527968 0.3676074
>      > poly(g, 4)3  0.01622001  1.0200891  0.01590058 0.9873471
>      > poly(g, 4)4 -0.33455044  1.0200891 -0.32796199 0.7436621
>
>      > Best,
>      > -Deepayan
>
>      >>
>      >> Martin
>      >>
>      >> --
>      >> Martin Maechler
>      >> ETH Zurich  and   R Core team
>      >>
>      >> > --
>      >> > Dr. Roland Fuß
>      >>
>      >> > Thünen-Institut für Agrarklimaschutz/
>      >> > Thünen Institute of Climate-Smart Agriculture
>      >>
>      >> > Bundesallee 65
>      >> > D-38116 Braunschweig, Germany
>
> I have committed a straightforward small change to R-devel(only)
> such that  poly(f, n)  now will _again_ (as in R <= 4.1.0)
> signal an error if `f` is factor but not an _ordered_ factor:
>
> ------------------------------------------------------------------------
> r88970 | maechler | 2025-10-25 12:48:37 +0200 (Sa, 25. Okt 2025) |
>
> poly(<factor>, ..) should error (unless it is.ordered(.))
> ------------------------------------------------------------------------
>
> Martin
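
To see which behaviour a given R installation has, a small check along the following lines may help. This is only a sketch: `poly_accepts()` is a hypothetical helper written for this illustration, not part of R, and the result for the unordered factor depends on the R version (per the message above, an error is expected in R-devel from r88970 onwards, while affected releases return a result).

```r
# Hypothetical helper (not part of R): TRUE if poly(x, degree) runs
# without signalling an error, FALSE otherwise.
poly_accepts <- function(x, degree = 2) {
  tryCatch({
    poly(x, degree)
    TRUE
  }, error = function(e) FALSE)
}

f_unordered <- gl(5, 20)                  # plain (unordered) factor, 5 levels
f_ordered   <- gl(5, 20, ordered = TRUE)  # ordered factor, same levels

poly_accepts(1:10)         # numeric input is always fine
poly_accepts(f_ordered)    # version-dependent; the thread argues for allowing it
poly_accepts(f_unordered)  # version-dependent; should be FALSE once the fix applies
```

Wrapping the call in tryCatch() keeps the check usable in scripts that have to run on several R versions without stopping at the first error.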

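The identical t values in Deepayan's quoted example follow from the two design matrices differing only by a constant factor per column. The sketch below illustrates this for the same balanced layout (5 levels, 20 observations each). It assumes the integer codes of the ordered factor are used as the numeric input to poly(), so it does not rely on poly() accepting factors; the variable names are chosen for this illustration only.

```r
g <- gl(5, 20, ordered = TRUE)   # same design as in the quoted example
codes <- as.integer(g)           # integer codes 1..5 (assumption: equally spaced scores)

C <- contr.poly(5)               # 5 x 4 contrast matrix: .L, .Q, .C, ^4
C_obs <- C[codes, ]              # one row per observation (100 x 4)

P <- unclass(poly(codes, 4))     # 100 x 4 orthogonal polynomial basis

# Each contr.poly column has unit norm over the 5 levels, each poly column
# has unit norm over the 100 observations, so with 20 replicates per level
# the columns should differ by a factor of sqrt(20).
scaled_diff <- max(abs(unname(P[, 1:4]) - unname(C_obs) / sqrt(20)))
scaled_diff                      # expected to be floating-point noise
```

The same factor shows up in the quoted output: the poly(g, 4) estimates and standard errors are sqrt(20) (about 4.47) times the contr.poly ones, which leaves the t values and p values unchanged.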

