# [R] Orders of terms in formulae

Bill.Venables at CMIS.CSIRO.AU
Tue Jan 21 01:53:03 CET 2003

```
Simon Wotherspoon suggests:

>  -----Original Message-----
> From: 	Simon Wotherspoon [mailto:Simon.Wotherspoon at utas.edu.au]
> Sent:	Tuesday, January 21, 2003 10:08 AM
> To:	r-help at stat.math.ethz.ch
> Subject:	[R] Orders of terms in formulae
>
> Hi,
>
> Given that R reports Type I sums of squares, isn't it a bit anachronistic
> that it re-orders terms in formulae?
[WNV]  R does not purport to report Type I anything.  The anova and
summary.aov functions, by default, report sequential analysis of variance
tables with factor terms ordered by their degree.

> > d <- expand.grid(y=rnorm(8),
> +             A=factor(c(1,2)),
> +             B=factor(c(1,2)),
> +             C=factor(c(1,2)))
> > summary(aov(y ~ A+B+A:B+C,data=d))
>             Df    Sum Sq   Mean Sq   F value Pr(>F)
> A            1 8.294e-34 8.294e-34 1.027e-33      1
> B            1 3.961e-33 3.961e-33 4.904e-33      1
> C            1 3.980e-34 3.980e-34 4.927e-34      1
> A:B          1 1.294e-32 1.294e-32 1.601e-32      1
> Residuals   59    47.658     0.808
[WNV]  You can change the default, of course.
> d <- expand.grid(A = factor(1:2),
+                  B = factor(1:2), C = factor(1:2))
> d$y <- rnorm(8)
> fm <- aov(terms(y ~ A*B + C, keep.order = TRUE), d)
> summary(fm)
            Df  Sum Sq Mean Sq F value  Pr(>F)
A            1 2.70009 2.70009  5.9045 0.09334
B            1 2.34018 2.34018  5.1175 0.10871
A:B          1 0.46492 0.46492  1.0167 0.38758
C            1 2.33694 2.33694  5.1104 0.10886
Residuals    3 1.37187 0.45729
> anova(fm)
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value  Pr(>F)
A          1 2.70009 2.70009  5.9045 0.09334
B          1 2.34018 2.34018  5.1175 0.10871
A:B        1 0.46492 0.46492  1.0167 0.38758
C          1 2.33694 2.33694  5.1104 0.10886
Residuals  3 1.37187 0.45729
>
[WNV]  I realise this dodge is a little arcane, but it is important.
>
> Or have I missed the point?
[WNV]  I think you have missed the point a bit.  If you did fit a
model of the kind

aov(y ~ A*B*C + C*D*E, dat)

and you wanted the main effects to be included first in the model,
followed by the interactions ordered by degree (as would be usual), how
would you ensure that this happened (without tediously spelling it all
out) if your convention were in force?

This convention extends far beyond R and S-PLUS, by the way.  It
probably started with Genstat back in the middle ages.

The FAQ, by the way, has something to say about these kinds of
issues, but not this one precisely (7.20 comes close).  Perhaps this should
find its way into that august document.

> Simon.
> ---
>
Bill Venables,
CMIS, CSIRO Marine Laboratories,
PO Box 120, Cleveland, Qld. 4163
AUSTRALIA
Phone:  +61 7 3826 7251
Fax:    +61 7 3826 7304
Mobile: +61 419 634 642
<mailto: Bill.Venables at csiro.au>
http://www.cmis.csiro.au/bill.venables/

```
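
The reordering discussed in this thread can be inspected directly with `terms()`, without fitting any model. What follows is only a minimal sketch: the object names `f`, `default_order`, `kept_order` and `big_degrees` are illustrative and do not appear in the original messages. It compares the default term ordering with `keep.order = TRUE`, then checks the degrees of the terms in the larger formula from the reply.

```r
# Formula as written in the original question: A:B appears before C.
f <- y ~ A + B + A:B + C

# Default behaviour: terms are sorted by degree, so the main effect C
# is moved ahead of the two-factor interaction A:B.
default_order <- attr(terms(f), "term.labels")
print(default_order)   # "A" "B" "C" "A:B"

# keep.order = TRUE leaves the terms in the order they were written.
kept_order <- attr(terms(f, keep.order = TRUE), "term.labels")
print(kept_order)      # "A" "B" "A:B" "C"

# For the larger model in the reply, the "order" attribute holds the
# degree of each term; by default the degrees never decrease.
big_degrees <- attr(terms(y ~ A*B*C + C*D*E), "order")
print(big_degrees)
```

Passing the `terms` object produced with `keep.order = TRUE` to `aov()`, as in the reply above, is what makes the sequential table follow the written order.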