[R] Re-Post: Combining Factors in model.matrix

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jan 26 09:06:45 CET 2004


On Mon, 26 Jan 2004, Paul Boutros wrote:

> > On Sat, 24 Jan 2004 paul.boutros at utoronto.ca wrote:
> >
> > > I didn't get any response on this before, leading me to believe
> > > I've missed something fundamental.  Can anybody guide me in the
> > > correct direction for more help on this?
> 
> Thanks for your reply:
> 
> > You will need to explain to us why the object you list is `the design
> > matrix': have *you* a reference for that?  R is doing the conventional
> > thing, and I at least have no idea where your example comes from.
> 
> Perhaps I have used the wrong terminology?  My understanding of a design
> matrix is that it identifies which factors are present for a given experiment.

The design matrix is X in the regression usually represented by

y = Xb + e

and is called a model matrix in S/R.
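
A minimal sketch of that correspondence, assuming the default treatment
contrasts; the response values in y are made up purely for illustration:

```r
# Two two-level factors, as in the question.
t1 <- factor(c(1, 1, 2, 2))
t2 <- factor(c(1, 2, 1, 2))

# Illustrative response only: these numbers are invented for the sketch.
y <- c(3.1, 4.0, 5.2, 6.3)

# X in y = Xb + e; columns are (Intercept), t12, t22.
X <- model.matrix(~ t1 + t2)

# Least-squares estimate of b from the normal equations.
b <- solve(crossprod(X), crossprod(X, y))

# lm() builds the same model matrix internally, so the coefficients agree.
fit <- lm(y ~ t1 + t2)
all.equal(unname(coef(fit)), as.vector(b))
```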

> Here, I have a two factor experiment, where each factor has two levels.
> In the case I gave:
>    t1 t2
> 1   1  0
> 2   1  1
> 3   0  0
> 4   0  1
> 
> I had expected this to represent four distinct experiments where
> factor one is present in the first two and absent in the second two.

You can't have factors that are present/absent.  (You can have levels of
treatments which are present/absent.)  We understand the rows to represent
the individual runs of a single experiment, but what do the columns
represent?

> > You seem to have coded variables t1 and t2 the opposite ways (the
> > reference level is 2 for t1 and 1 for t2), and your model has the fit at
> > levels t1=2,t2=1 constrained to pass through the origin.  I don't think R
> > has a simple syntax for that (although you can fake anything), and I find
> > it hard to believe that is actually what you want.
> 
> That wasn't my intention, I want to retain the intercept term and
> not constrain it to pass through the origin.

So why did you use ~ -1 + (t1+t2) ?  That explicitly removes the 
intercept.
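
For comparison, a sketch of the same formula with and without the
intercept, assuming the default treatment contrasts.  The last step
is only one possible reading of the matrix written out in the
question (reference level 2 for t1, reference level 1 for t2); t1r is a
name introduced here for illustration:

```r
t1 <- factor(c(1, 1, 2, 2))
t2 <- factor(c(1, 2, 1, 2))

# "-1" drops the intercept, so t1 gets one indicator column per level.
X_noint <- model.matrix(~ -1 + (t1 + t2))   # columns: t11, t12, t22

# With the intercept retained, the first level of each factor is the
# reference level and gets no column of its own.
X_int <- model.matrix(~ t1 + t2)            # columns: (Intercept), t12, t22

# Assumption: level 2 of t1 is meant to be the reference level.
t1r <- relevel(t1, ref = "2")
X_want <- model.matrix(~ t1r + t2)          # columns: (Intercept), t1r1, t22

# The two non-intercept columns, for comparison with the t1/t2 columns
# written out in the question.
X_want[, c("t1r1", "t22")]
```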


> Paul
> 
> > >
> > > Paul
> > >
> > > =================================================
> > > I want to be able to create a design matrix with two factors.
> > > For instance, if I have:
> > >
> > > > t1 <- factor(c(1,1,2,2));
> > > > t2 <- factor(c(1,2,1,2));
> > > > design <- model.matrix(~ -1 + (t1+t2));
> > > > design;
> > >   t11 t12 t22
> > > 1   1   0   0
> > > 2   1   0   1
> > > 3   0   1   0
> > > 4   0   1   1
> > >
> > > But the design matrix I want is:
> > >    t1 t2
> > > 1   1  0
> > > 2   1  1
> > > 3   0  0
> > > 4   0  1
> > >
> > > Actually, in general I'm struggling with the syntax for formulating a
> > > design matrix I can write down on paper.  Is there a reference for this
> > > beyond the R documentation?
> >
> > Chapter 6 of MASS has the most complete exposition (by Bill Venables)
> > that I know of, and the White Book (Chambers & Hastie, 1992) goes well
> > beyond the R documentation (which uses it as the reference).
> >
> > --
> > Brian D. Ripley,                  ripley at stats.ox.ac.uk
> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford,             Tel:  +44 1865 272861 (self)
> > 1 South Parks Road,                     +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> >

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
