[Rd] Bug in model.matrix.default for higher-order interaction encoding when specific model terms are missing
Tyler
tylermw at gmail.com
Sat Nov 4 17:33:25 CET 2017
Hi Arie,
I understand what you're saying. The following excerpt from the book
shows that F_j does not refer exclusively to categorical factors: "...the
rule does not do anything special for them, and it remains valid, in a
trivial sense, whenever any of the F_j is numeric rather than categorical."
Since F_j refers to both categorical and numeric variables, the behavior of
model.matrix is not consistent with the heuristic.
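
(For reference, a minimal sketch of the inconsistency, rebuilding the
runmatrix data from the original report quoted below:)

runmatrix <- expand.grid(X1 = c(1, -1), X2 = c(1, -1), X3 = c("A", "B", "C"))
# Dropping the numeric-numeric term X1:X2 switches X1:X2:X3 to dummy coding,
# giving three columns: X1:X2:X3A, X1:X2:X3B, X1:X2:X3C
colnames(model.matrix(~(X1 + X2 + X3)^3 - X1:X2, data = runmatrix))
# Dropping X1:X3 instead leaves X1:X2:X3 coded by contrasts (two columns),
# even though the rule, read with numeric F_j, would call for dummy coding too
colnames(model.matrix(~(X1 + X2 + X3)^3 - X1:X3, data = runmatrix))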
Best regards,
Tyler
On Sat, Nov 4, 2017 at 6:50 AM, Arie ten Cate <arietencate at gmail.com> wrote:
> Hello Tyler,
>
> I rephrase my previous mail, as follows:
>
> In your example, T_i = X1:X2:X3. Let F_j = X3. (The numerical
> variables X1 and X2 are not encoded at all.) Then T_{i(j)} = X1:X2,
> which in the example is dropped from the model. Hence the X3 in T_i
> must be encoded by dummy variables, as indeed it is.
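>
> (A quick check, rebuilding the example data, confirms this coding:)
>
> runmatrix <- expand.grid(X1 = c(1, -1), X2 = c(1, -1), X3 = c("A", "B", "C"))
> # The three-way term expands to one column per level of X3:
> # "X1:X2:X3A" "X1:X2:X3B" "X1:X2:X3C"
> grep("X1:X2:X3", colnames(model.matrix(~(X1 + X2 + X3)^3 - X1:X2,
>                                        data = runmatrix)), value = TRUE)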
>
> Arie
>
>
> On Thu, Nov 2, 2017 at 4:11 PM, Tyler <tylermw at gmail.com> wrote:
> > Hi Arie,
> >
> > The book on which this behavior is based does not use "factor" (in this
> > section) to refer to a categorical factor. I will again point to this
> > sentence from page 40, in the same section and referring to the behavior
> > in question, which shows that F_j is not limited to categorical factors:
> > "Numeric variables appear in the computations as themselves, uncoded.
> > Therefore, the rule does not do anything special for them, and it remains
> > valid, in a trivial sense, whenever any of the F_j is numeric rather than
> > categorical."
> >
> > Note the "... whenever any of the F_j is numeric rather than
> > categorical." Factor here is used in the more general sense of the word,
> > not referring to the R type "factor." The behavior of R does not match
> > the heuristic that it's citing.
> >
> > Best regards,
> > Tyler
> >
> > On Thu, Nov 2, 2017 at 2:51 AM, Arie ten Cate <arietencate at gmail.com> wrote:
> >>
> >> Hello Tyler,
> >>
> >> Thank you for searching for, and finding, the basic description of the
> >> behavior of R in this matter.
> >>
> >> I think your example is in agreement with the book.
> >>
> >> But let me first note the following. You write: "F_j refers to a
> >> factor (variable) in a model and not a categorical factor". However:
> >> "a factor is a vector object used to specify a discrete
> >> classification" (start of chapter 4 of "An Introduction to R".) You
> >> might also see the description of the R function factor().
> >>
> >> You note that the book says about a factor F_j:
> >> "... F_j is coded by contrasts if T_{i(j)} has appeared in the
> >> formula and by dummy variables if it has not"
> >>
> >> You find:
> >> "However, the example I gave demonstrated that this dummy variable
> >> encoding only occurs for the model where the missing term is the
> >> numeric-numeric interaction, ~(X1+X2+X3)^3-X1:X2."
> >>
> >> We have here T_i = X1:X2:X3. Also: F_j = X3 (the only factor). Then
> >> T_{i(j)} = X1:X2, which is dropped from the model. Hence the X3 in T_i
> >> must be encoded by dummy variables, as indeed it is.
> >>
> >> Arie
> >>
> >> On Tue, Oct 31, 2017 at 4:01 PM, Tyler <tylermw at gmail.com> wrote:
> >> > Hi Arie,
> >> >
> >> > Thank you for your further research into the issue.
> >> >
> >> > Regarding Stata: On the other hand, JMP gives model matrices that use
> >> > the main effects contrasts in computing the higher order interactions,
> >> > without the dummy variable encoding. I verified this both by analyzing
> >> > the linear model given in my first example and noting that JMP has one
> >> > more degree of freedom than R for the same model, and by looking at
> >> > the generated model matrices. It's easy to find a design where JMP
> >> > will allow us to fit our model with goodness-of-fit estimates and R
> >> > will not, due to the extra degree(s) of freedom required. Let's keep
> >> > the conversation limited to R.
> >> >
> >> > I want to refocus on my original bug report, which was not about a
> >> > missing main effects term, but rather a missing lower-order
> >> > interaction term. The behavior of model.matrix.default() for a missing
> >> > main effects term is a nice example to demonstrate how model.matrix()
> >> > encodes with dummy variables instead of contrasts, but it doesn't
> >> > demonstrate the inconsistent behavior my bug report highlighted.
> >> >
> >> > I went looking for documentation on this behavior, and the issue
> >> > stems not from model.matrix.default(), but rather from the terms()
> >> > function in interpreting the formula. This "clever" replacement of
> >> > contrasts by dummy variables to maintain marginality (presuming that's
> >> > the reason) is not described anywhere in the documentation for either
> >> > the model.matrix() or the terms() function. In order to find a
> >> > description of the behavior, I had to look in the underlying C code,
> >> > in the comment above the "TermCode" function of the "model.c" file,
> >> > which says:
> >> >
> >> > "TermCode decides on the encoding of a model term. Returns 1 if
> variable
> >> > ``whichBit'' in ``thisTerm'' is to be encoded by contrasts and 2 if it
> >> > is to
> >> > be encoded by dummy variables. This is decided using the heuristic
> >> > described in Statistical Models in S, page 38."
> >> >
> >> > I do not have a copy of this book, and I suspect most R users do not
> >> > either. Thankfully, however, some of the pages describing this
> >> > behavior were available as part of Amazon's "Look Inside" feature--but
> >> > if not for that, I would have no idea what heuristic R was using.
> >> > Since those pages could be made unavailable by Amazon at any time, at
> >> > the very least we have a problem with a lack of documentation.
> >> >
> >> > However, I still believe there is a bug when comparing R's
> >> > implementation to the heuristic described in the book. From
> >> > Statistical Models in S, pages 38-39:
> >> >
> >> > "Suppose F_j is any factor included in term T_i. Let T_{i(j)} denote
> the
> >> > margin of T_i for factor F_j--that is, the term obtained by dropping
> F_j
> >> > from T_i. We say that T_{i(j)} has appeared in the formula if there is
> >> > some
> >> > term T_i' for i' < i such that T_i' contains all the factors appearing
> >> > in
> >> > T_{i(j)}. The usual case is that T_{i(j)} itself is one of the
> preceding
> >> > terms. Then F_j is coded by contrasts if T_{i(j)} has appeared in the
> >> > formula and by dummy variables if it has not"
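> >> >
> >> > (For concreteness, a rough sketch of this heuristic in R -- a
> >> > hypothetical helper, not the actual C implementation in model.c:)
> >> >
> >> > # t and each preceding term are character vectors of factor names
> >> > coding_for <- function(f, t, preceding_terms) {
> >> >   margin <- setdiff(t, f)  # T_{i(j)}: the term obtained by dropping F_j
> >> >   appeared <- any(vapply(preceding_terms,
> >> >                          function(tp) all(margin %in% tp), logical(1)))
> >> >   if (appeared) "contrasts" else "dummy variables"
> >> > }
> >> >
> >> > # X3 in X1:X2:X3 when X1:X2 is missing from the preceding terms:
> >> > coding_for("X3", c("X1", "X2", "X3"),
> >> >            list("X1", "X2", "X3", c("X1", "X3"), c("X2", "X3")))
> >> > # [1] "dummy variables"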
> >> >
> >> > Here, F_j refers to a factor (variable) in a model and not a
> >> > categorical factor, as specified later in that section (page 40):
> >> > "Numeric variables appear in the computations as themselves, uncoded.
> >> > Therefore, the rule does not do anything special for them, and it
> >> > remains valid, in a trivial sense, whenever any of the F_j is numeric
> >> > rather than categorical."
> >> >
> >> > Going back to my original example with three variables: X1 (numeric),
> >> > X2 (numeric), X3 (categorical). This heuristic prescribes encoding
> >> > X1:X2:X3 with contrasts as long as X1:X2, X1:X3, and X2:X3 exist in
> >> > the formula. When any of the preceding terms do not exist, this
> >> > heuristic tells us to use dummy variables to encode the interaction
> >> > (e.g. "F_j [the interaction term] is coded ... by dummy variables if
> >> > it [any of the marginal terms obtained by dropping a single factor in
> >> > the interaction] has not [appeared in the formula]"). However, the
> >> > example I gave demonstrated that this dummy variable encoding only
> >> > occurs for the model where the missing term is the numeric-numeric
> >> > interaction, "~(X1+X2+X3)^3-X1:X2". Otherwise, the interaction term
> >> > X1:X2:X3 is encoded by contrasts, not dummy variables. This is
> >> > inconsistent with the description of the intended behavior given in
> >> > the book.
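> >> >
> >> > (One way to see this directly -- a small sketch counting the columns
> >> > generated for X1:X2:X3, using the runmatrix data from my original
> >> > report:)
> >> >
> >> > runmatrix <- expand.grid(X1 = c(1, -1), X2 = c(1, -1), X3 = c("A", "B", "C"))
> >> > count3way <- function(f)
> >> >   sum(grepl("X1:X2:X3", colnames(model.matrix(f, data = runmatrix))))
> >> > count3way(~(X1 + X2 + X3)^3 - X1:X3)  # 2: coded by contrasts
> >> > count3way(~(X1 + X2 + X3)^3 - X2:X3)  # 2: coded by contrasts
> >> > count3way(~(X1 + X2 + X3)^3 - X1:X2)  # 3: coded by dummy variables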
> >> >
> >> > Best regards,
> >> > Tyler
> >> >
> >> >
> >> > On Fri, Oct 27, 2017 at 2:18 PM, Arie ten Cate <arietencate at gmail.com> wrote:
> >> >>
> >> >> Hello Tyler,
> >> >>
> >> >> I want to bring to your attention the following document: "What
> >> >> happens if you omit the main effect in a regression model with an
> >> >> interaction?"
> >> >> (https://stats.idre.ucla.edu/stata/faq/what-happens-if-you-omit-the-main-effect-in-a-regression-model-with-an-interaction).
> >> >> This gives a useful review of the problem. Your example is Case 2: a
> >> >> continuous and a categorical regressor.
> >> >>
> >> >> The numerical examples are coded in Stata, and they give the same
> >> >> result as in R. Hence, if this is a bug in R then it is also a bug in
> >> >> Stata. That seems very unlikely.
> >> >>
> >> >> Here is a simulation in R of the above-mentioned Case 2 in Stata:
> >> >>
> >> >> df <- expand.grid(socst=c(-1:1),grp=c("1","2","3","4"))
> >> >> print("Full model")
> >> >> print(model.matrix(~(socst+grp)^2 ,data=df))
> >> >> print("Example 2.1: drop socst")
> >> >> print(model.matrix(~(socst+grp)^2 -socst ,data=df))
> >> >> print("Example 2.2: drop grp")
> >> >> print(model.matrix(~(socst+grp)^2 -grp ,data=df))
> >> >>
> >> >> This indeed gives the following regressors:
> >> >>
> >> >> "Full model"
> >> >> (Intercept) socst grp2 grp3 grp4 socst:grp2 socst:grp3 socst:grp4
> >> >> "Example 2.1: drop socst"
> >> >> (Intercept) grp2 grp3 grp4 socst:grp1 socst:grp2 socst:grp3 socst:grp4
> >> >> "Example 2.2: drop grp"
> >> >> (Intercept) socst socst:grp2 socst:grp3 socst:grp4
> >> >>
> >> >> There is a little bit of R documentation about this, based on the
> >> >> concept of marginality, which typically forbids a model having an
> >> >> interaction but not the corresponding main effects. (You might see
> >> >> the references in https://en.wikipedia.org/wiki/Principle_of_marginality)
> >> >> See "An Introduction to R", by Venables and Smith and the R Core
> >> >> Team. At the bottom of page 52 (PDF: 57) it says: "Although the
> >> >> details are complicated, model formulae in R will normally generate
> >> >> the models that an expert statistician would expect, provided that
> >> >> marginality is preserved. Fitting, for [a contrary] example, a model
> >> >> with an interaction but not the corresponding main effects will in
> >> >> general lead to surprising results ....".
> >> >> The Reference Manual states that the R functions dropterm() and
> >> >> addterm() respectively drop or add only terms such that marginality
> >> >> is preserved.
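> >> >>
> >> >> (For instance, a small sketch with made-up data -- dropterm() and
> >> >> addterm() live in package MASS:)
> >> >>
> >> >> library(MASS)
> >> >> set.seed(1)
> >> >> d <- data.frame(x = rnorm(20), g = gl(2, 10), y = rnorm(20))
> >> >> fit <- lm(y ~ x * g, data = d)
> >> >> # Only the interaction x:g is offered for dropping; the main effects
> >> >> # x and g are protected while x:g remains in the model
> >> >> dropterm(fit)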
> >> >>
> >> >> Finally, about your singular matrix t(mm)%*%mm. This is in fact
> >> >> Example 2.1 in Case 2 discussed above. As discussed there, in Stata
> >> >> and in R the drop of the continuous variable has no effect on the
> >> >> degrees of freedom here: it is just a reparameterisation of the full
> >> >> model, protecting you against losing marginality... Hence the
> >> >> model.matrix 'mm' is still square and nonsingular after the drop of
> >> >> X1, unless of course a row is removed from the matrix 'design' before
> >> >> creating 'mm'.
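> >> >>
> >> >> (A quick check of this, with the full six-run grid and no row
> >> >> removed:)
> >> >>
> >> >> design <- expand.grid(X1 = c(1, -1), X2 = c("A", "B", "C"))
> >> >> mm <- model.matrix(~(X1 + X2)^2 - X1, data = design)
> >> >> dim(mm)              # 6 x 6: still square after dropping X1
> >> >> solve(t(mm) %*% mm)  # invertible, no error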
> >> >>
> >> >> Arie
> >> >>
> >> >> On Sun, Oct 15, 2017 at 7:05 PM, Tyler <tylermw at gmail.com> wrote:
> >> >> > You could possibly try to explain away the behavior for a missing
> >> >> > main effects term, since without the main effects term we don't
> >> >> > have main effect columns in the model matrix used to compute the
> >> >> > interaction columns. (At best this is undocumented behavior--I
> >> >> > still think it's a bug, as we know how we would encode the
> >> >> > categorical factors if they were in fact present. It's either
> >> >> > specified in contrasts.arg or using the default set in options.)
> >> >> > However, when all the main effects are present, why would the
> >> >> > three-factor interaction column not simply be the product of the
> >> >> > main effect columns? In my example: we know X1, we know X2, and we
> >> >> > know X3. Why does the encoding of X1:X2:X3 depend on whether we
> >> >> > specified a two-factor interaction, and only change for specific
> >> >> > missing interactions?
> >> >> >
> >> >> > In addition, I can use a two-term example similar to yours to show
> >> >> > how this behavior results in a singular covariance matrix when,
> >> >> > given the desired factor encoding, it should not be singular.
> >> >> >
> >> >> > We start with a full factorial design for a two-level continuous
> >> >> > factor and a three-level categorical factor, and remove a single
> >> >> > row. This design matrix does not leave enough degrees of freedom to
> >> >> > determine goodness-of-fit, but should allow us to obtain parameter
> >> >> > estimates.
> >> >> >
> >> >> >> design = expand.grid(X1=c(1,-1),X2=c("A","B","C"))
> >> >> >> design = design[-1,]
> >> >> >> design
> >> >> >   X1 X2
> >> >> > 2 -1  A
> >> >> > 3  1  B
> >> >> > 4 -1  B
> >> >> > 5  1  C
> >> >> > 6 -1  C
> >> >> >
> >> >> > Here, we first calculate the model matrix for the full model, and
> >> >> > then manually remove the X1 column from the model matrix. This
> >> >> > gives us the model matrix one would expect if X1 were removed from
> >> >> > the model. We then successfully calculate the covariance matrix.
> >> >> >
> >> >> >> mm = model.matrix(~(X1+X2)^2,data=design)
> >> >> >> mm
> >> >> >   (Intercept) X1 X2B X2C X1:X2B X1:X2C
> >> >> > 2           1 -1   0   0      0      0
> >> >> > 3           1  1   1   0      1      0
> >> >> > 4           1 -1   1   0     -1      0
> >> >> > 5           1  1   0   1      0      1
> >> >> > 6           1 -1   0   1      0     -1
> >> >> >
> >> >> >> mm = mm[,-2]
> >> >> >> solve(t(mm) %*% mm)
> >> >> >             (Intercept)  X2B  X2C X1:X2B X1:X2C
> >> >> > (Intercept)           1 -1.0 -1.0    0.0    0.0
> >> >> > X2B                  -1  1.5  1.0    0.0    0.0
> >> >> > X2C                  -1  1.0  1.5    0.0    0.0
> >> >> > X1:X2B                0  0.0  0.0    0.5    0.0
> >> >> > X1:X2C                0  0.0  0.0    0.0    0.5
> >> >> >
> >> >> > Here, we see the actual behavior of model.matrix. The undesired
> >> >> > re-coding of the model matrix interaction term makes the
> >> >> > information matrix singular.
> >> >> >
> >> >> >> mm = model.matrix(~(X1+X2)^2-X1,data=design)
> >> >> >> mm
> >> >> >   (Intercept) X2B X2C X1:X2A X1:X2B X1:X2C
> >> >> > 2           1   0   0     -1      0      0
> >> >> > 3           1   1   0      0      1      0
> >> >> > 4           1   1   0      0     -1      0
> >> >> > 5           1   0   1      0      0      1
> >> >> > 6           1   0   1      0      0     -1
> >> >> >
> >> >> >> solve(t(mm) %*% mm)
> >> >> > Error in solve.default(t(mm) %*% mm) : system is computationally
> >> >> > singular: reciprocal condition number = 5.55112e-18
> >> >> >
> >> >> > I still believe this is a bug.
> >> >> >
> >> >> > Best regards,
> >> >> > Tyler Morgan-Wall
> >> >> >
> >> >> > On Sun, Oct 15, 2017 at 1:49 AM, Arie ten Cate <arietencate at gmail.com> wrote:
> >> >> >
> >> >> >> I think it is not a bug. It is a general property of interactions.
> >> >> >> This property is best observed if all variables are factors
> >> >> >> (qualitative).
> >> >> >>
> >> >> >> For example, you have three variables (factors). You ask for as
> >> >> >> many interactions as possible, except an interaction term between
> >> >> >> two particular variables. When this interaction is not a constant,
> >> >> >> it is different for different values of the remaining variable.
> >> >> >> More precisely: for all values of that variable. In other words:
> >> >> >> you have a three-way interaction, with all values of that variable.
> >> >> >>
> >> >> >> An even smaller example is the following script with only two
> >> >> >> variables, each being a factor:
> >> >> >>
> >> >> >> df <- expand.grid(X1=c("p","q"), X2=c("A","B","C"))
> >> >> >> print(model.matrix(~(X1+X2)^2 ,data=df))
> >> >> >> print(model.matrix(~(X1+X2)^2 -X1,data=df))
> >> >> >> print(model.matrix(~(X1+X2)^2 -X2,data=df))
> >> >> >>
> >> >> >> The result is:
> >> >> >>
> >> >> >>   (Intercept) X1q X2B X2C X1q:X2B X1q:X2C
> >> >> >> 1           1   0   0   0       0       0
> >> >> >> 2           1   1   0   0       0       0
> >> >> >> 3           1   0   1   0       0       0
> >> >> >> 4           1   1   1   0       1       0
> >> >> >> 5           1   0   0   1       0       0
> >> >> >> 6           1   1   0   1       0       1
> >> >> >>
> >> >> >>   (Intercept) X2B X2C X1q:X2A X1q:X2B X1q:X2C
> >> >> >> 1           1   0   0       0       0       0
> >> >> >> 2           1   0   0       1       0       0
> >> >> >> 3           1   1   0       0       0       0
> >> >> >> 4           1   1   0       0       1       0
> >> >> >> 5           1   0   1       0       0       0
> >> >> >> 6           1   0   1       0       0       1
> >> >> >>
> >> >> >>   (Intercept) X1q X1p:X2B X1q:X2B X1p:X2C X1q:X2C
> >> >> >> 1           1   0       0       0       0       0
> >> >> >> 2           1   1       0       0       0       0
> >> >> >> 3           1   0       1       0       0       0
> >> >> >> 4           1   1       0       1       0       0
> >> >> >> 5           1   0       0       0       1       0
> >> >> >> 6           1   1       0       0       0       1
> >> >> >>
> >> >> >> Thus, in the second result, we have no main effect of X1. Instead,
> >> >> >> the effect of X1 depends on the value of X2: either A or B or C.
> >> >> >> In fact, this is a two-way interaction, including all three values
> >> >> >> of X2. In the third result, we have no main effect of X2. The
> >> >> >> effect of X2 depends on the value of X1: either p or q.
> >> >> >>
> >> >> >> A complicating element with your example seems to be that your X1
> >> >> >> and X2 are not factors.
> >> >> >>
> >> >> >> Arie
> >> >> >>
> >> >> >> On Thu, Oct 12, 2017 at 7:12 PM, Tyler <tylermw at gmail.com> wrote:
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > I recently ran into an inconsistency in the way
> >> >> >> > model.matrix.default handles factor encoding for higher level
> >> >> >> > interactions with categorical variables when the full hierarchy
> >> >> >> > of effects is not present. Depending on which lower level
> >> >> >> > interactions are specified, the factor encoding changes for a
> >> >> >> > higher level interaction. Consider the following minimal
> >> >> >> > reproducible example:
> >> >> >> >
> >> >> >> > --------------
> >> >> >> >
> >> >> >> >> runmatrix = expand.grid(X1=c(1,-1),X2=c(1,-1),X3=c("A","B","C"))
> >> >> >> >> model.matrix(~(X1+X2+X3)^3,data=runmatrix)
> >> >> >> >    (Intercept) X1 X2 X3B X3C X1:X2 X1:X3B X1:X3C X2:X3B X2:X3C X1:X2:X3B X1:X2:X3C
> >> >> >> > 1            1  1  1   0   0     1      0      0      0      0         0         0
> >> >> >> > 2            1 -1  1   0   0    -1      0      0      0      0         0         0
> >> >> >> > 3            1  1 -1   0   0    -1      0      0      0      0         0         0
> >> >> >> > 4            1 -1 -1   0   0     1      0      0      0      0         0         0
> >> >> >> > 5            1  1  1   1   0     1      1      0      1      0         1         0
> >> >> >> > 6            1 -1  1   1   0    -1     -1      0      1      0        -1         0
> >> >> >> > 7            1  1 -1   1   0    -1      1      0     -1      0        -1         0
> >> >> >> > 8            1 -1 -1   1   0     1     -1      0     -1      0         1         0
> >> >> >> > 9            1  1  1   0   1     1      0      1      0      1         0         1
> >> >> >> > 10           1 -1  1   0   1    -1      0     -1      0      1         0        -1
> >> >> >> > 11           1  1 -1   0   1    -1      0      1      0     -1         0        -1
> >> >> >> > 12           1 -1 -1   0   1     1      0     -1      0     -1         0         1
> >> >> >> > attr(,"assign")
> >> >> >> >  [1] 0 1 2 3 3 4 5 5 6 6 7 7
> >> >> >> > attr(,"contrasts")
> >> >> >> > attr(,"contrasts")$X3
> >> >> >> > [1] "contr.treatment"
> >> >> >> >
> >> >> >> > --------------
> >> >> >> >
> >> >> >> > Specifying the full hierarchy gives us what we expect: the
> >> >> >> > interaction columns are simply calculated from the product of
> >> >> >> > the main effect columns. The interaction term X1:X2:X3 gives us
> >> >> >> > two columns in the model matrix, X1:X2:X3B and X1:X2:X3C,
> >> >> >> > matching the products of the main effects.
> >> >> >> >
> >> >> >> > If we remove either the X2:X3 interaction or the X1:X3
> >> >> >> > interaction, we get what we would expect for the X1:X2:X3
> >> >> >> > interaction, but when we remove the X1:X2 interaction the
> >> >> >> > encoding for X1:X2:X3 changes completely:
> >> >> >> >
> >> >> >> > --------------
> >> >> >> >
> >> >> >> >> model.matrix(~(X1+X2+X3)^3-X1:X3,data=runmatrix)
> >> >> >> >    (Intercept) X1 X2 X3B X3C X1:X2 X2:X3B X2:X3C X1:X2:X3B X1:X2:X3C
> >> >> >> > 1            1  1  1   0   0     1      0      0         0         0
> >> >> >> > 2            1 -1  1   0   0    -1      0      0         0         0
> >> >> >> > 3            1  1 -1   0   0    -1      0      0         0         0
> >> >> >> > 4            1 -1 -1   0   0     1      0      0         0         0
> >> >> >> > 5            1  1  1   1   0     1      1      0         1         0
> >> >> >> > 6            1 -1  1   1   0    -1      1      0        -1         0
> >> >> >> > 7            1  1 -1   1   0    -1     -1      0        -1         0
> >> >> >> > 8            1 -1 -1   1   0     1     -1      0         1         0
> >> >> >> > 9            1  1  1   0   1     1      0      1         0         1
> >> >> >> > 10           1 -1  1   0   1    -1      0      1         0        -1
> >> >> >> > 11           1  1 -1   0   1    -1      0     -1         0        -1
> >> >> >> > 12           1 -1 -1   0   1     1      0     -1         0         1
> >> >> >> > attr(,"assign")
> >> >> >> >  [1] 0 1 2 3 3 4 5 5 6 6
> >> >> >> > attr(,"contrasts")
> >> >> >> > attr(,"contrasts")$X3
> >> >> >> > [1] "contr.treatment"
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> >> model.matrix(~(X1+X2+X3)^3-X2:X3,data=runmatrix)
> >> >> >> >    (Intercept) X1 X2 X3B X3C X1:X2 X1:X3B X1:X3C X1:X2:X3B X1:X2:X3C
> >> >> >> > 1            1  1  1   0   0     1      0      0         0         0
> >> >> >> > 2            1 -1  1   0   0    -1      0      0         0         0
> >> >> >> > 3            1  1 -1   0   0    -1      0      0         0         0
> >> >> >> > 4            1 -1 -1   0   0     1      0      0         0         0
> >> >> >> > 5            1  1  1   1   0     1      1      0         1         0
> >> >> >> > 6            1 -1  1   1   0    -1     -1      0        -1         0
> >> >> >> > 7            1  1 -1   1   0    -1      1      0        -1         0
> >> >> >> > 8            1 -1 -1   1   0     1     -1      0         1         0
> >> >> >> > 9            1  1  1   0   1     1      0      1         0         1
> >> >> >> > 10           1 -1  1   0   1    -1      0     -1         0        -1
> >> >> >> > 11           1  1 -1   0   1    -1      0      1         0        -1
> >> >> >> > 12           1 -1 -1   0   1     1      0     -1         0         1
> >> >> >> > attr(,"assign")
> >> >> >> >  [1] 0 1 2 3 3 4 5 5 6 6
> >> >> >> > attr(,"contrasts")
> >> >> >> > attr(,"contrasts")$X3
> >> >> >> > [1] "contr.treatment"
> >> >> >> >
> >> >> >> >
> >> >> >> >> model.matrix(~(X1+X2+X3)^3-X1:X2,data=runmatrix)
> >> >> >> >    (Intercept) X1 X2 X3B X3C X1:X3B X1:X3C X2:X3B X2:X3C X1:X2:X3A X1:X2:X3B X1:X2:X3C
> >> >> >> > 1            1  1  1   0   0      0      0      0      0         1         0         0
> >> >> >> > 2            1 -1  1   0   0      0      0      0      0        -1         0         0
> >> >> >> > 3            1  1 -1   0   0      0      0      0      0        -1         0         0
> >> >> >> > 4            1 -1 -1   0   0      0      0      0      0         1         0         0
> >> >> >> > 5            1  1  1   1   0      1      0      1      0         0         1         0
> >> >> >> > 6            1 -1  1   1   0     -1      0      1      0         0        -1         0
> >> >> >> > 7            1  1 -1   1   0      1      0     -1      0         0        -1         0
> >> >> >> > 8            1 -1 -1   1   0     -1      0     -1      0         0         1         0
> >> >> >> > 9            1  1  1   0   1      0      1      0      1         0         0         1
> >> >> >> > 10           1 -1  1   0   1      0     -1      0      1         0         0        -1
> >> >> >> > 11           1  1 -1   0   1      0      1      0     -1         0         0        -1
> >> >> >> > 12           1 -1 -1   0   1      0     -1      0     -1         0         0         1
> >> >> >> > attr(,"assign")
> >> >> >> >  [1] 0 1 2 3 3 4 4 5 5 6 6 6
> >> >> >> > attr(,"contrasts")
> >> >> >> > attr(,"contrasts")$X3
> >> >> >> > [1] "contr.treatment"
> >> >> >> >
> >> >> >> > --------------
> >> >> >> >
> >> >> >> > Here, we see that the encoding for the interaction X1:X2:X3 is
> >> >> >> > now the interaction of X1 and X2 with a new encoding for X3,
> >> >> >> > where each factor level is represented by its own column. I
> >> >> >> > would expect, given the two-column dummy variable encoding for
> >> >> >> > X3, that the X1:X2:X3 term would also be two columns regardless
> >> >> >> > of what two-factor interactions we also specified, but in this
> >> >> >> > case it switches to three. If other two-factor interactions are
> >> >> >> > missing in addition to X1:X2, this issue still occurs. This also
> >> >> >> > happens regardless of the contrast specified in contrasts.arg
> >> >> >> > for X3. I don't see any reasoning for this behavior given in the
> >> >> >> > documentation, so I suspect it is a bug.
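> >> >> >> >
> >> >> >> > (For example, one way to verify the contrasts.arg point:)
> >> >> >> >
> >> >> >> > # Requesting sum contrasts for X3 does not change the three-way
> >> >> >> > # term: it still expands to one dummy column per level of X3
> >> >> >> > m <- model.matrix(~(X1+X2+X3)^3 - X1:X2, data = runmatrix,
> >> >> >> >                   contrasts.arg = list(X3 = "contr.sum"))
> >> >> >> > grep("X1:X2:X3", colnames(m), value = TRUE)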
> >> >> >> >
> >> >> >> > Best regards,
> >> >> >> > Tyler Morgan-Wall
> >> >> >> >