[R-sig-eco] [R-sig-phylo] GEE and AIC

Emmanuel Paradis Emmanuel.Paradis at mpl.ird.fr
Mon Jun 2 15:34:58 CEST 2008


Le 30.05.2008 02:18, tgarland at ucr.edu a écrit :
> This is not a problem for contrasts of GLS-type methods. It can be a
> pain in the a** to code a whole bunch of categorical variables as dummy
> variables and then compute contrasts (depending on your software), but
> it is not a problem from the perspective of the math/stats.

You're right to add "depending on your software" because it's very easy
with R (saving the user's a**), but few people seem to know it. The
function model.matrix is the tool here. Suppose you have a factor with
four levels and two repetitions for each level:

R> (x <- gl(4, 2))
[1] 1 1 2 2 3 3 4 4
Levels: 1 2 3 4

Then you just specify to model.matrix that x is a predictor in the usual
R formula notation:

R> model.matrix(~ x)
   (Intercept) x2 x3 x4
1           1  0  0  0
2           1  0  0  0
3           1  1  0  0
4           1  1  0  0
5           1  0  1  0
6           1  0  1  0
7           1  0  0  1
8           1  0  0  1
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")$x
[1] "contr.treatment"

This is actually used to prepare the "model matrix" in a regression,
hence the column of 1's added for the intercept. If you just want the
dummy variables:

R> model.matrix(~ x)[, -1]
   x2 x3 x4
1  0  0  0
2  0  0  0
3  1  0  0
4  1  0  0
5  0  1  0
6  0  1  0
7  0  0  1
8  0  0  1

If you've got several such variables you can do model.matrix( ~ x+y+z),
possibly adding interactions too. If they are in a data.frame, then use
the option 'data'.

If you don't like the current coding, you can change it in the options
(see ?C):

R> options(contrasts = c("contr.sum", "contr.poly"))
R> model.matrix(~ x)[, -1]
   x1 x2 x3
1  1  0  0
2  1  0  0
3  0  1  0
4  0  1  0
5  0  0  1
6  0  0  1
7 -1 -1 -1
8 -1 -1 -1

model.matrix() is called internally by most regression functions (lm,
glm, gee, gls, lme, ...) so the user doesn't have to bother about it,
but it's nice to see how it works.
-- 
Emmanuel Paradis
IRD, Jakarta, Indonesia
     http://ape.mpl.ird.fr/



More information about the R-sig-ecology mailing list