[Rd] Why does poly work for unordered factors?

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Sat Oct 25 12:55:22 CEST 2025


>>>>> Deepayan Sarkar 
>>>>>     on Wed, 22 Oct 2025 16:43:49 +0530 writes:

    > On Wed, 22 Oct 2025 at 14:41, Martin Maechler
    > <maechler using stat.math.ethz.ch> wrote:
    >> 
    >> >>>>> Roland Fuß via R-devel
    >> >>>>>     on Wed, 22 Oct 2025 10:24:07 +0200 writes:
    >> 
    >> > This doesn't seem intended.
    >> 
    >> You are right.  The code change, reverting to previous behaviour
    >> notably for "Date",
    >> was prompted on this R-devel list,
    >> https://stat.ethz.ch/pipermail/r-devel/2022-July/081850.html
    >> 
    >> But that the change also allows poly(<factor>, .) to work was overlooked (by
    >> me and everyone else ..) and is a bug we will fix.
    >> 
    >> > See:
    >> 
    >> > https://stackoverflow.com/questions/79795583/why-does-poly-work-for-unordered-factors-it-previously-did-not-work
    >> 
    >> As was already raised in the above SO thread,
    >> what should happen for *ordered* factors is less obvious.
    >> A warning was proposed, but I thought that this was too harsh;
    >> hence, we could use message(), or just keep allowing it.
    >> 
    >> Opinions?


    > Given that we use contr.poly by default for ordered factors, I think
    > it's very natural to allow it (without even a message). In fact, it
    > would be a nice way to illustrate what contr.poly does; e.g.,

    > > y <- rnorm(100); g <- gl(5, 20, ordered = TRUE)
    > > summary(lm(y ~ g)) |> coefficients()
    >                 Estimate Std. Error     t value  Pr(>|t|)
    > (Intercept)  0.138970785  0.1020089  1.36233970 0.1763120
    > g.L         -0.182590696  0.2280989 -0.80048932 0.4254247
    > g.Q         -0.206493256  0.2280989 -0.90527968 0.3676074
    > g.C          0.003626904  0.2280989  0.01590058 0.9873471
    > g^4         -0.074807753  0.2280989 -0.32796199 0.7436621
    > > summary(lm(y ~ poly(g, 4))) |> coefficients()
    >                Estimate Std. Error     t value  Pr(>|t|)
    > (Intercept)  0.13897078  0.1020089  1.36233970 0.1763120
    > poly(g, 4)1 -0.81657042  1.0200891 -0.80048932 0.4254247
    > poly(g, 4)2 -0.92346592  1.0200891 -0.90527968 0.3676074
    > poly(g, 4)3  0.01622001  1.0200891  0.01590058 0.9873471
    > poly(g, 4)4 -0.33455044  1.0200891 -0.32796199 0.7436621

    > Best,
    > -Deepayan

    >> 
    >> Martin
    >> 
    >> --
    >> Martin Maechler
    >> ETH Zurich  and   R Core team
    >> 
    >> > --
    >> > Dr. Roland Fuß
    >> 
    >> > Thünen-Institut für Agrarklimaschutz/
    >> > Thünen Institute of Climate-Smart Agriculture
    >> 
    >> > Bundesallee 65
    >> > D-38116 Braunschweig, Germany

I have committed a small, straightforward change to R-devel (only),
such that poly(f, n) will now _again_ (as in R <= 4.1.0)
signal an error if `f` is a factor but not an _ordered_ factor:

------------------------------------------------------------------------
r88970 | maechler | 2025-10-25 12:48:37 +0200 (Sa, 25. Okt 2025) |

poly(<factor>, ..) should error (unless it is.ordered(.))
------------------------------------------------------------------------
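
For readers on other R versions, the rule described above can be
approximated with a small wrapper. This is only a sketch, not the code
committed in r88970: the name poly_checked, its error message, and the
explicit use of the integer codes for an ordered factor are assumptions
made for illustration. It also checks the relation visible in Deepayan's
example, namely that in a balanced design the poly() columns are the
contr.poly columns rescaled.

```r
# Hypothetical wrapper (NOT the change committed to R-devel): it mimics
# the rule "error for a factor unless it is ordered" on any R version.
poly_checked <- function(x, degree = 1) {
    if (is.factor(x)) {
        if (!is.ordered(x))
            stop("poly() is not meaningful for an unordered factor")
        # Assumption: an ordered factor is represented by its integer
        # codes, i.e. equally spaced scores 1, 2, ..., nlevels(x).
        x <- as.integer(x)
    }
    stats::poly(x, degree = degree)
}

g <- gl(5, 20, ordered = TRUE)  # ordered factor, as in the example above
f <- gl(5, 20)                  # same codes, but unordered

P <- poly_checked(g, 4)         # accepted

# The unordered factor is rejected; keep the error message for inspection.
res <- tryCatch(poly_checked(f, 4),
                error = function(e) conditionMessage(e))

# With 20 observations per level, each column of P is the matching
# contr.poly column divided by sqrt(20).
C <- contr.poly(5)[as.integer(g), ] / sqrt(20)
max_diff <- max(abs(as.vector(P) - as.vector(C)))
```

Here max_diff should be zero up to rounding error. That rescaling is why
the poly() estimates in the quoted output are sqrt(20) times the g.L,
g.Q, ... estimates while the t values and p-values are identical.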

Martin


