[R] Formula with no intercept
Gang Chen
gangchen6 at gmail.com
Thu Apr 17 18:11:44 CEST 2008
Thanks to both Harold Doran and Prof. Ripley for the suggestions.
Time*Group - 1 or Time*(Group-1) does seem better. However, as Prof.
Ripley pointed out, things get a little complicated once interactions
are included. For example,
======
> set.seed(1)
> group <- as.factor(sample(c("M", "F"), 12, replace = TRUE))
> y <- rnorm(12)
> time <- as.factor(rep(1:4, 3))
> summary(fit <- lm(y ~ time * group - 1))
Call:
lm(formula = y ~ time * group - 1)

Residuals:
         1          2          3          4          5          6          7
-5.122e-01  3.916e-01  5.985e-01  9.547e-01  5.122e-01  1.665e-16 -5.985e-01
         8          9         10         11         12
-9.547e-01  0.000e+00 -3.916e-01 -5.551e-17  2.220e-16

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
time1         1.12493    0.91795   1.225    0.288
time2         0.38984    0.91795   0.425    0.693
time3        -0.02273    0.64909  -0.035    0.974
time4        -1.26004    0.64909  -1.941    0.124
groupM       -0.12533    1.12426  -0.111    0.917
time2:groupM  0.08218    1.58994   0.052    0.961
time3:groupM  0.13187    1.58994   0.083    0.938
time4:groupM  2.32921    1.58994   1.465    0.217

Residual standard error: 0.918 on 4 degrees of freedom
Multiple R-squared: 0.6962,  Adjusted R-squared: 0.08858
F-statistic: 1.146 on 8 and 4 DF,  p-value: 0.4796
=========
There are 8 fixed effects in total listed above. I believe I can
interpret time1, time2, time3 and time4 as the fixed effects of the
4 levels of factor Time in groupF. But I'm not so sure about the other
4 fixed effects. Are time2:groupM, time3:groupM, and time4:groupM the
differences between groupM and groupF in the fixed effects of those 3
levels of factor Time? If so, what is groupM (the 5th)? Or are
time2:groupM, time3:groupM, and time4:groupM the differences (between
groupM and groupF) in the fixed effects of those 3 levels of factor
Time relative to time1, with groupM (the 5th) being the effect of
groupM versus groupF at time1?
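
One way to check which reading holds is to compare the coefficients against the observed cell means, since a model with all time-by-group terms reproduces those means exactly. The sketch below is only illustrative: it uses made-up balanced data (3 observations per cell, in a hypothetical data frame d) instead of the 12 observations above, and it assumes R's default treatment contrasts with F as the reference level of group.

```r
# Hypothetical balanced data (not the data from the post): 3 observations per cell.
set.seed(42)
d <- expand.grid(rep = 1:3, time = factor(1:4), group = factor(c("F", "M")))
d$y <- rnorm(nrow(d))

fit <- lm(y ~ time * group - 1, data = d)
b <- coef(fit)

# Observed cell means: rows are time levels, columns are group levels.
m <- tapply(d$y, list(d$time, d$group), mean)
diff_MF <- m[, "M"] - m[, "F"]   # M minus F at each time level

# time1..time4 equal the group-F cell means.
check_time <- all.equal(unname(b[paste0("time", 1:4)]), unname(m[, "F"]))

# groupM equals the M-minus-F difference at time 1 only.
check_group <- all.equal(unname(b["groupM"]), unname(diff_MF[1]))

# timek:groupM equals (M minus F at time k) minus (M minus F at time 1).
check_inter <- all.equal(unname(b[paste0("time", 2:4, ":groupM")]),
                         unname(diff_MF[2:4] - diff_MF[1]))

c(time = check_time, groupM = check_group, interaction = check_inter)
```

Under those assumptions all three checks come out TRUE, which corresponds to the second reading: groupM is the M versus F difference at time1, and each timek:groupM is how much that difference changes at time k relative to time1.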
> packages such as multcomp can post-hoc test any (coherent) set of hypotheses you
> choose, irrespective of the model parametrization.
This does not seem true unless I'm missing something. See the following example:
===========
> set.seed(1)
> group <- as.factor(sample(c("M", "F"), 12, replace = TRUE))
> y <- rnorm(12)
> time <- as.factor(rep(1:4, 3))
> fit <- lm(y ~ time * group)
> library(multcomp)
> summary(glht(fit, linfct=c("time1=0", "time2=0")))
Error in chrlinfct2matrix(linfct, names(beta)) :
variable(s) 'time1' not found
> summary(glht(fit, linfct=c("time2=0", "time3=0")))
         Simultaneous Tests for General Linear Hypotheses

Fit: lm(formula = y ~ time * group)

Linear Hypotheses:
           Estimate Std. Error t value p value
time2 == 0  -0.7351     1.2982  -0.566   0.797
time3 == 0  -1.1477     1.1243  -1.021   0.533
(Adjusted p values reported -- single-step method)
==========
The problem is that glht doesn't accept a hypothesis written in terms
of time1 when an intercept is included in the model specification,
because the fitted model then has no coefficient named time1. Any more
thoughts?
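
A possible workaround, sketched here under assumptions and not verified against the session above: glht also accepts linfct as a numeric matrix with one row per hypothesis and one column per model coefficient, so a hypothesis about the time1 cell can be written as a combination of the coefficients that do exist. The example assumes "time1 = 0" is meant as "the group-F mean at time 1 is zero", uses made-up balanced data in a hypothetical data frame d, and only calls glht if multcomp is installed.

```r
# Hypothetical balanced data (not the data from the post): 3 observations per cell.
set.seed(42)
d <- expand.grid(rep = 1:3, time = factor(1:4), group = factor(c("F", "M")))
d$y <- rnorm(nrow(d))

fit <- lm(y ~ time * group, data = d)   # intercept included
cf <- coef(fit)

# One row per hypothesis, one column per coefficient of the fitted model.
K <- matrix(0, nrow = 2, ncol = length(cf),
            dimnames = list(c("time1, group F", "time2, group F"), names(cf)))
K["time1, group F", "(Intercept)"] <- 1                  # mean at time 1, group F
K["time2, group F", c("(Intercept)", "time2")] <- 1      # mean at time 2, group F

# The combinations K %*% coef reproduce the corresponding cell means.
est <- drop(K %*% cf)
est

# Simultaneous tests of both combinations, if multcomp is available.
if (requireNamespace("multcomp", quietly = TRUE)) {
  print(summary(multcomp::glht(fit, linfct = K)))
}
```

The same matrix approach works for any parametrization; only the rows of K change, which seems to be what "irrespective of the model parametrization" refers to.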
Thanks,
Gang
On 4/16/08, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> On Wed, 16 Apr 2008, Doran, Harold wrote:
>
>
> > R may not be giving you what you want, but it is doing the right thing.
> > You can change what the base category is through contrasts but you can't
> > get the marginal effects for every level of all factors because this
> > creates a linear dependence in the model matrix.
> >
>
> I suspect that Time*Group - 1 or Time*(Group-1) come closer to the aim. It
> is the first factor in the model which is coded without contrasts in a
> no-intercept model.
>
> Once you include interactions I think the 'convenience' is largely lost,
> and packages such as multcomp can post-hoc test any (coherent) set of
> hypotheses you choose, irrespective of the model parametrization.
>
>
>
> >
> > > -----Original Message-----
> > > From: r-help-bounces at r-project.org
> > > [mailto:r-help-bounces at r-project.org] On Behalf Of Gang Chen
> > > Sent: Monday, April 14, 2008 5:38 PM
> > > To: r-help at stat.math.ethz.ch
> > > Subject: [R] Formula with no intercept
> > >
> > > I'm trying to analyze a model with two variables, one is
> > > Group with two levels (male and female), and other is Time
> > > with four levels (T1, T2, T3 and T4). And for the convenience
> > > of post-hoc testing I wanted to consider a model with no
> > > intercept for factor Time, so I tried formula
> > >
> > > Group*(Time-1)
> > >
> > > However this seems to give me the following terms in the model
> > >
> > > GroupMale, GroupFemale, TimeT2, TimeT3, TimeT4,
> > > GroupMale:TimeT2, GroupMale:TimeT3, GroupMale:TimeT4,
> > > GroupFemale:TimeT2, GroupFemale:TimeT3, GroupFemale:TimeT4
> > >
> > > which is not exactly what I wanted. Also it seems (Group-1)*Time and
> > > (Group-1)*(Time-1) also give me exactly the same set of terms
> > > as Group*(Time-1).
> > >
> > > So I have some conceptual trouble understanding this. And how
> > > could I create a model with terms including all the levels of
> > > factor Time?
> > >
> > > Thanks,
> > > Gang