[R] How to test omitted level from a multiple level factor against overall mean in regression models?

Gabor Grothendieck ggrothendieck at gmail.com
Sun Mar 25 14:11:20 CEST 2012


2012/3/25 "Biedermann, Jürgen" <Juergen.Biedermann at charite.de>:
> Hi there,
>
> I have a linear model with one factor having three levels.
> I want to check if the different levels significantly differ from the overall mean (using contr.sum).
> However, one level (the last) is omitted from the standard output.
>
> To illustrate this:
>
> x <- as.factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3))
> y <- c(1.1,1.15,1.2,1.1,1.1,1.1,1.2,1.2,1.2,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3,3.1)
> test <- data.frame(x,y)
> reg1 <- lm(y~C(x,contr.sum),data=test)
> summary(reg1)
>
> Coefficients:
>                 Estimate Std. Error t value Pr(>|t|)
> (Intercept)       1.63333    0.06577  24.834 8.48e-15 ***
> C(x, contr.sum)1 -0.48333    0.10792  -4.479  0.00033 ***
> C(x, contr.sum)2 -0.48333    0.08936  -5.409 4.70e-05 ***
>
> Is it possible to get the effect for the third level (against the overall mean) in the table too?
>
> I figured out:
>
> reg2 <- lm(y~C(relevel(x,3),contr.sum),data=test)
> summary(reg2)
>
> C(relevel(x, 3), contr.sum)1  0.96667    0.07951  12.158 8.24e-10 ***
> C(relevel(x, 3), contr.sum)2 -0.48333    0.10792  -4.479  0.00033 ***
>
>
> The first row now tests the third level against the overall mean, but I find this approach inconvenient.
> Moreover, I wonder whether it is meaningful at all, given the accumulation of alpha error. Would a Bonferroni correction be sensible?
>

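Try this. dummy.coef() reports the coefficient of every level, including the one omitted from the summary table: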

> options(contrasts = c("contr.sum", "contr.poly"))
> reg1 <- lm(y~x,data=test)
> dummy.coef(reg1)
Full coefficients are

(Intercept):      1.633333
x:                       1          2          3
                -0.4833333 -0.4833333  0.9666667
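
dummy.coef() returns estimates only. If a standard error and t test for the omitted level are also wanted, one possible approach is to use the fact that, under contr.sum, the third effect is minus the sum of the other two, and to take its variance from vcov(). The following is a minimal sketch under that assumption, not a definitive recipe; the names K and res are illustrative, and the contrast is set per model instead of through options(). The Bonferroni column only shows the mechanics of the adjustment asked about in the question; whether it is appropriate here is a judgment call.

```r
# Data from the original question
x <- factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3))
y <- c(1.1,1.15,1.2,1.1,1.1,1.1,1.2,1.2,1.2,
       2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3,3.1)
test <- data.frame(x, y)

# Sum-to-zero contrasts for this model only (no global options() change)
fit <- lm(y ~ x, data = test, contrasts = list(x = "contr.sum"))

b <- coef(fit)   # (Intercept), x1, x2
V <- vcov(fit)   # covariance matrix of the coefficients

# Each row of K picks out one level's effect as a linear combination
# of the coefficients; level 3 is -(x1 + x2) under contr.sum.
K <- rbind(level1 = c(0,  1,  0),
           level2 = c(0,  0,  1),
           level3 = c(0, -1, -1))

est  <- drop(K %*% b)
se   <- sqrt(diag(K %*% V %*% t(K)))
tval <- est / se
pval <- 2 * pt(abs(tval), df = df.residual(fit), lower.tail = FALSE)

res <- data.frame(estimate = est, se = se, t = tval, p = pval,
                  p_bonf = p.adjust(pval, method = "bonferroni"))
print(res)
```

The level3 row should agree with the first row of summary(reg2) in the question (estimate 0.96667, standard error 0.07951), without refitting the model on a releveled factor.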

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


