[R] Why does the order of terms in a formula translate into different models/model matrices?

Sun Jan 29 02:42:41 CET 2012

 <cberry <at> tajo.ucsd.edu> writes:

> 
> Alexandra <alku <at> imm.dtu.dk> writes:
> 
[snip]

 Close, but not quite. The problem lies in terms().

 Here are the attr(terms(...),"factors") matrices:

 > attributes(terms(Y ~ x:A + A:B, data=dat))$factors
   x:A A:B
 Y   0   0
 x   2   0
 A   2   2
 B   0   1
 > attributes(terms(Y ~ A:B + x:A, data=dat))$factors
   A:B A:x
 Y   0   0
 A   2   2
 B   2   0
 x   0   1

 As you can see, x and B are encoded differently under the
 two orderings.

 See ?terms.object for what those codes mean.
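
 For reference, ?terms.object describes the entries as: 0 if the variable
 does not occur in the term, 1 if it occurs and is to be coded by
 contrasts, and 2 if it occurs and is to be coded by dummy variables for
 all levels. Below is a minimal sketch for lining the codes up per
 variable. It assumes a made-up `dat` (the original data is not shown in
 this thread) with numeric x and factors A and B; the level counts and
 sample size are arbitrary.

```r
# Hypothetical data: the `dat` used in the thread is not available,
# so this stand-in only mimics its presumed structure.
set.seed(1)
dat <- data.frame(
  Y = rnorm(12),
  x = rnorm(12),
  A = gl(2, 6, labels = c("a1", "a2")),
  B = gl(3, 2, 12, labels = c("b1", "b2", "b3"))
)

f1 <- attr(terms(Y ~ x:A + A:B, data = dat), "factors")
f2 <- attr(terms(Y ~ A:B + x:A, data = dat), "factors")

# Put the rows (variables) in one common order so the codes line up.
vars <- c("Y", "x", "A", "B")
f1 <- f1[vars, , drop = FALSE]
f2 <- f2[vars, , drop = FALSE]

# Same interaction, two orderings, shown next to each other.
cmp <- cbind(
  "x:A (first)"  = f1[, "x:A"],
  "A:x (second)" = f2[, "A:x"],
  "A:B (first)"  = f1[, "A:B"],
  "A:B (second)" = f2[, "A:B"]
)
print(cmp)
```

 The rows are reordered only to make the comparison easier; the codes
 themselves are whatever terms() returns in your version of R.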

 Same deal for these seemingly equivalent formulae:

 > attributes(terms(Y ~ (x + A + B)^2 - A, data=dat))$factors
   x B x:A x:B A:B
 Y 0 0   0   0   0
 x 1 0   2   1   0
 A 0 0   1   0   1
 B 0 1   0   1   1
 > attributes(terms(Y ~ (A + B + x)^2 - A, data=dat))$factors
   B x A:B A:x B:x
 Y 0 0   0   0   0
 A 0 0   1   1   0
 B 1 0   2   0   1
 x 0 1   0   1   1
 > 
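
 To check whether the differing codes carry through to the design
 matrix, one could compare what model.matrix() builds for the two
 orderings. This is only a sketch: `dat` is again a made-up stand-in for
 the data in the thread, and the result may depend on the R version.

```r
# Hypothetical data, same presumed structure as in the thread:
# numeric x, two-level factor A, three-level factor B.
set.seed(1)
dat <- data.frame(
  Y = rnorm(12),
  x = rnorm(12),
  A = gl(2, 6, labels = c("a1", "a2")),
  B = gl(3, 2, 12, labels = c("b1", "b2", "b3"))
)

m1 <- model.matrix(Y ~ (x + A + B)^2 - A, data = dat)
m2 <- model.matrix(Y ~ (A + B + x)^2 - A, data = dat)

# Column names show which variables were expanded by contrasts
# and which got a dummy column for every level.
print(colnames(m1))
print(colnames(m2))

# Number of columns and numerical rank under each ordering.
print(c(first = ncol(m1), second = ncol(m2)))
print(c(first = qr(m1)$rank, second = qr(m2)$rank))
```

 If the two orderings really described the same model, the column spaces
 (and hence the ranks) would be expected to agree.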

(quoting removed to make Gmane happy)

 AFAICS, this is a bug.

  I think so too, although I haven't got my head around it yet.

  Chuck, are you willing to post a summary of this to r-devel
for discussion ... and/or post a bug report?