[R] model.matrix for a factor effect with no intercept
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Feb 23 23:25:09 CET 2005
David,
I am sorry, I did not read your question as `why dummy coding not poly
coding', but I still seemed to have answered it. BTW,
contr.poly(a, contrasts = FALSE)
should be an indicator matrix according to its help page, so I would not
have expected what you expected, and if it had been, you might not have
been surprised. [More on that below.]
I would say the algorithm was `code by the specified contrasts only if
dummy coding is not parsimonious'.
Your final point isn't the whole story. If I have 0+a+b+a:b, R uses the
contrasts for a in the interaction but not the main term. So the
attribute "contrasts" for 'a' represents
`the contrast used for those terms (possibly none) in which it needed
coding'.
Now, if there were none it might be better to say nothing, but that is not
actually calculated explicitly. I've just checked, and we do not document
on the help page what it means. I have just drafted
If there are any factors in terms in the model, there is an attribute
\code{"contrasts"}, a named list with an entry for each factor. This
specifies the contrasts that would be used in terms in which the
factor is coded by contrasts (in some terms dummy coding may be used),
either as a character vector naming a function or as a numeric matrix.
As far as I recall the contrasts attribute is returned to be tacked on a
fit and used later to reconstruct the model matrix, or perhaps a model
matrix for another formula and the same data. So I think it does need to
be the contrasts as specified at the time of the model.matrix call.
It is probably helpful to note that it is not specifically treatment
contrasts that are used but dummy coding: contr.sum and contr.helmert give
the same matrix.
It is a bug that contr.poly(x, contrasts=FALSE) does not behave as
documented (a bug shared with S, it seems). I do dimly recall this having
been raised in the past (probably by me), but cannot find any record as to
why the discrepancy remains. We should change the code or the
documentation, probably the latter.
Brian
On Wed, 23 Feb 2005, David Firth wrote:
> Brian, many thanks for these helpful pointers:
>
> On 23 Feb 2005, at 15:45, Prof Brian Ripley wrote:
>
>> MASS4 p.150
>> White Book p.38
>>
>> Those are the only two reasonably comprehensive accounts that I am aware of
>> (and they have only partial overlap).
>>
>> The underlying motivation is to span the _additional_ vector space covered
>> by the term, the complement to what has gone before. Put another way, as
>> each term is added, only enough columns are added to the model matrix to
>> span the same space as if dummy coding had been used for that term and its
>> predecessors. So think of this as a way to produce a parsimonious (usually
>> full-rank) basis for the model space.
>
> Yes, indeed. My surprise was that *this* particular basis (dummy coding) was
> the one used here. I should have got a clue from what contrasts() does, and
> is documented to do,
>
>> options("contrasts")
> $contrasts
> [1] "contr.treatment" "contr.poly"
>> contrasts(a)
> .L .Q
> [1,] -7.071068e-01 0.4082483
> [2,] -9.073800e-17 -0.8164966
> [3,] 7.071068e-01 0.4082483
>
> but when there's no intercept in the model the contrasts used appear to be
>
>> contrasts(a, contrasts = FALSE)
> -1 0 1
> -1 1 0 0
> 0 0 1 0
> 1 0 0 1
>
> which are not the same as
>
>> contr.poly(a, contrasts = FALSE)
> ^0 ^1 ^2
> [1,] 1 -7.071068e-01 0.4082483
> [2,] 1 -9.073800e-17 -0.8164966
> [3,] 1 7.071068e-01 0.4082483
>
> -- which is what I had naively expected to get in my model matrix.
> This is of course all a matter of convention. The present convention does
> seem a touch confusing though: the basis for the space spanned by a factor is
> determined by options("contrasts") or by the contrasts attribute of the
> factor or by the contrasts argument in the call, *except* when there's no
> intercept or other factor earlier in the model in which case all such
> settings are ignored (well, not *quite* ignored: they do get put in the
> contrasts attribute of the resultant model matrix).
>
> On the other hand, one good reason to use dummy coding for the
> first-encountered factor when there's no intercept is that the associated
> parameters are then often interpretable as group-specific intercepts.
>
> Would it be an improvement, though, if the "contrasts" attribute of the
> resultant model matrix contained "contr.treatment" in such cases instead of
> the name of a contrast function that was not actually used?
>
> Best wishes,
> David
>
>
>>
>> On Wed, 23 Feb 2005, David Firth wrote:
>>
>>> I was surprised by this (in R 2.0.1):
>>>
>>>> a <- ordered(-1:1)
>>>> a
>>> [1] -1 0 1
>>> Levels: -1 < 0 < 1
>>>
>>>> model.matrix(~ a)
>>> (Intercept) a.L a.Q
>>> 1 1 -7.071068e-01 0.4082483
>>> 2 1 -9.073800e-17 -0.8164966
>>> 3 1 7.071068e-01 0.4082483
>>> attr(,"assign")
>>> [1] 0 1 1
>>> attr(,"contrasts")
>>> attr(,"contrasts")$a
>>> [1] "contr.poly"
>>>
>>>> model.matrix(~ -1 + a)
>>> a-1 a0 a1
>>> 1 1 0 0
>>> 2 0 1 0
>>> 3 0 0 1
>>> attr(,"assign")
>>> [1] 1 1 1
>>> attr(,"contrasts")
>>> attr(,"contrasts")$a
>>> [1] "contr.poly"
>>>
>>> Without the intercept, treatment contrasts seem to have been used (this
>>> despite the "contr.poly" in the "contrasts" attribute).
>>>
>>> It's not restricted to ordered factors. For example, if Helmert contrasts
>>> are used for nominal factors, the same sort of thing happens.
>>>
>>> I suppose it is a deliberate feature (perhaps to protect the user from
>>> accidentally fitting models that make no sense? or maybe some better
>>> reason?) -- is it explained somewhere?
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list