[R] Formula with no intercept

Gang Chen gangchen6 at gmail.com
Thu Apr 17 18:11:44 CEST 2008


Thanks to both Harold Doran and Prof. Ripley for the suggestions.
Time*Group - 1 or Time*(Group-1) does seem better. However, as Prof.
Ripley pointed out, things get a little complicated once interactions
are included. For example,

======

> set.seed(1)
> group <- as.factor(sample(c("M", "F"), 12, TRUE))
> y <- rnorm(12)
> time <- as.factor(rep(1:4, 3))
> summary(fit <- lm(y ~ time * group - 1))

Call:
lm(formula = y ~ time * group - 1)

Residuals:
         1          2          3          4          5          6          7
-5.122e-01  3.916e-01  5.985e-01  9.547e-01  5.122e-01  1.665e-16 -5.985e-01
         8          9         10         11         12
-9.547e-01  0.000e+00 -3.916e-01 -5.551e-17  2.220e-16

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
time1         1.12493    0.91795   1.225    0.288
time2         0.38984    0.91795   0.425    0.693
time3        -0.02273    0.64909  -0.035    0.974
time4        -1.26004    0.64909  -1.941    0.124
groupM       -0.12533    1.12426  -0.111    0.917
time2:groupM  0.08218    1.58994   0.052    0.961
time3:groupM  0.13187    1.58994   0.083    0.938
time4:groupM  2.32921    1.58994   1.465    0.217

Residual standard error: 0.918 on 4 degrees of freedom
Multiple R-squared: 0.6962,     Adjusted R-squared: 0.08858
F-statistic: 1.146 on 8 and 4 DF,  p-value: 0.4796
=========

There are 8 fixed effects in total listed above. I believe I can
interpret time1, time2, time3 and time4 as the fixed effects of the
4 levels of factor Time in groupF. But I'm not so sure about the other
4 fixed effects. Are time2:groupM, time3:groupM, and time4:groupM the
differences between groupM and groupF in the fixed effects of those 3
levels of factor Time? If so, what is groupM (the 5th)? Or are
time2:groupM, time3:groupM, and time4:groupM the differences (between
groupM and groupF) of the fixed effects of those 3 levels of Time
relative to time1, with groupM (the 5th) being the effect of groupM
versus groupF at time1?
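
One way to check which reading holds is to compare the coefficients
with the observed cell means. The following is only a minimal sketch: it
assumes a hypothetical balanced data set (not the data above), so that
every time-by-group cell is non-empty, and R's default treatment
contrasts. The data frame d and the matrix m are names made up for the
illustration.

```r
# Hypothetical balanced data: 4 times x 2 groups x 3 replicates
set.seed(42)
d <- expand.grid(rep = 1:3, time = factor(1:4), group = factor(c("F", "M")))
d$y <- rnorm(nrow(d))

fit <- lm(y ~ time * group - 1, data = d)
b <- coef(fit)

# Observed cell means: rows are time levels, columns are groups
m <- with(d, tapply(y, list(time, group), mean))

# time1..time4 reproduce the group-F cell means
check_time <- all.equal(unname(b[paste0("time", 1:4)]), unname(m[, "F"]))

# groupM is the M minus F difference at time 1
diff1 <- m["1", "M"] - m["1", "F"]
check_group <- all.equal(unname(b["groupM"]), unname(diff1))

# timek:groupM is the M minus F difference at time k, minus that at time 1
k <- 2:4
check_inter <- all.equal(unname(b[paste0("time", k, ":groupM")]),
                         unname((m[k, "M"] - m[k, "F"]) - diff1))

print(c(time = check_time, group = check_group, interaction = check_inter))
```

Because the model is saturated, the fitted values are just the cell
means, which is why the coefficients can be matched to them this way.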

> packages such as multcomp can post-hoc test any (coherent) set of hypotheses you
> choose, irrespective of the model parametrization.

This does not seem true unless I'm missing something. See the following example:

===========
> set.seed(1)
> group <- as.factor(sample(c("M", "F"), 12, TRUE))
> y <- rnorm(12)
> time <- as.factor(rep(1:4, 3))
> fit <- lm(y ~ time * group)
> library(multcomp)
> summary(glht(fit, linfct=c("time1=0", "time2=0")))
Error in chrlinfct2matrix(linfct, names(beta)) :
  variable(s) 'time1' not found
> summary(glht(fit, linfct=c("time2=0", "time3=0")))

         Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = y ~ time * group)

Linear Hypotheses:
           Estimate Std. Error t value p value
time2 == 0  -0.7351     1.2982  -0.566   0.797
time3 == 0  -1.1477     1.1243  -1.021   0.533
(Adjusted p values reported -- single-step method)
==========

The problem is that glht doesn't allow any hypothesis involving time1
if an intercept is included in the model specification. Any more
thoughts?
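
A possible workaround, sketched here under assumptions: glht also
accepts a numeric matrix for linfct, with one column per model
coefficient, so a hypothesis about the time1 cell can be written as a
linear combination that involves the intercept. The data set, the matrix
K and its row labels below are hypothetical, and the glht call is only
run if multcomp is installed.

```r
# Hypothetical balanced data: 4 times x 2 groups x 3 replicates
set.seed(42)
d <- expand.grid(rep = 1:3, time = factor(1:4), group = factor(c("F", "M")))
d$y <- rnorm(nrow(d))

# Model with an intercept, as in the failing example
fit <- lm(y ~ time * group, data = d)

# One row per hypothesis, one column per coefficient
K <- matrix(0, nrow = 2, ncol = length(coef(fit)),
            dimnames = list(c("F at time1", "F at time2"), names(coef(fit))))
K["F at time1", "(Intercept)"] <- 1                 # group-F mean at time 1
K["F at time2", c("(Intercept)", "time2")] <- 1     # group-F mean at time 2

# The linear combinations being tested, computed with base R
est <- drop(K %*% coef(fit))
print(est)

# Simultaneous tests of both combinations against zero
if (requireNamespace("multcomp", quietly = TRUE)) {
  print(summary(multcomp::glht(fit, linfct = K)))
}
```

With this form the hypotheses are stated on cell means rather than on
coefficient names, so they do not depend on whether the model was fitted
with or without an intercept.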

Thanks,
Gang


On 4/16/08, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> On Wed, 16 Apr 2008, Doran, Harold wrote:
>
>
> > R may not be giving you what you want, but it is doing the right thing.
> > You can change what the base category is through contrasts but you can't
> > get the marginal effects for every level of all factors because this
> > creates a linear dependence in the model matrix.
> >
>
>  I suspect that Time*Group - 1 or Time*(Group-1) come closer to the aim. It
> is the first factor in the model which is coded without contrasts in a
> no-intercept model.
>
>  Once you include interactions I think the 'convenience' is largely lost,
> and packages such as multcomp can post-hoc test any (coherent) set of
> hypotheses you choose, irrespective of the model parametrization.
>
>
>
> >
> > > -----Original Message-----
> > > From: r-help-bounces at r-project.org
> > > [mailto:r-help-bounces at r-project.org] On Behalf Of Gang Chen
> > > Sent: Monday, April 14, 2008 5:38 PM
> > > To: r-help at stat.math.ethz.ch
> > > Subject: [R] Formula with no intercept
> > >
> > > I'm trying to analyze a model with two variables, one is
> > > Group with two levels (male and female), and other is Time
> > > with four levels (T1, T2, T3 and T4). And for the convenience
> > > of post-hoc testing I wanted to consider a model with no
> > > intercept for factor Time, so I tried formula
> > >
> > > Group*(Time-1)
> > >
> > > However this seems to give me the following terms in the model
> > >
> > > GroupMale, GroupFemale, TimeT2, TimeT3, TimeT4,
> > > GroupMale:TimeT2, GroupMale:TimeT3, GroupMale:TimeT4,
> > > GroupFemale:TimeT2, GroupFemale:TimeT3, GroupFemale:TimeT4
> > >
> > > which is not exactly what I wanted. Also it seems (Group-1)*Time and
> > > (Group-1)*(Time-1) also give me exactly the same set of terms
> > > as Group*(Time-1).
> > >
> > > So I have some conceptual trouble understanding this. And how
> > > could I create a model with terms including all the levels of
> > > factor Time?
> > >
> > > Thanks,
> > > Gang
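
The coding rule Prof. Ripley describes in the quoted message (in a
no-intercept model, only the first factor gets a column for every
level) can be seen from the column names of the model matrix. A small
sketch with hypothetical factors named time and group, using the
default treatment contrasts:

```r
# Hypothetical design: every combination of 4 time levels and 2 groups
d <- expand.grid(time = factor(1:4), group = factor(c("F", "M")))

# time listed first: all four time levels get their own column
cols_time_first <- colnames(model.matrix(~ time * group - 1, data = d))

# group listed first: both group levels get their own column instead
cols_group_first <- colnames(model.matrix(~ group * time - 1, data = d))

print(cols_time_first)
print(cols_group_first)
```

Both versions have 8 columns and describe the same fitted model; only
the parametrization differs.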


