[R] How to test omitted level from a multiple level factor against overall mean in regression models?
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Mar 25 14:11:20 CEST 2012
2012/3/25 "Biedermann, Jürgen" <Juergen.Biedermann at charite.de>:
> Hi there,
>
> I have a linear model with one factor having three levels.
> I want to check if the different levels significantly differ from the overall mean (using contr.sum).
> However one level (the last) is omitted in the standard procedure.
>
> To illustrate this:
>
> x <- as.factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3))
> y <- c(1.1,1.15,1.2,1.1,1.1,1.1,1.2,1.2,1.2,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3,3.1)
> test <- data.frame(x,y)
> reg1 <- lm(y~C(x,contr.sum),data=test)
> summary(reg1)
>
> Coefficients:
>                  Estimate Std. Error t value Pr(>|t|)
> (Intercept)       1.63333    0.06577  24.834 8.48e-15 ***
> C(x, contr.sum)1 -0.48333    0.10792  -4.479  0.00033 ***
> C(x, contr.sum)2 -0.48333    0.08936  -5.409 4.70e-05 ***
>
> Is it possible to get the effect for the third level (against the overall mean) in the table too?
>
> I figured out:
>
> reg2 <- lm(y~C(relevel(x,3),contr.sum),data=test)
> summary(reg2)
>
> C(relevel(x, 3), contr.sum)1  0.96667    0.07951  12.158 8.24e-10 ***
> C(relevel(x, 3), contr.sum)2 -0.48333    0.10792  -4.479  0.00033 ***
>
>
> The first row now tests the third level against the overall mean, but I find this approach inconvenient.
> Moreover, I wonder whether it is meaningful at all, given the accumulation of alpha error. Would a Bonferroni correction be sensible?
>
Try this:
> options(contrasts = c("contr.sum", "contr.poly"))
> reg1 <- lm(y~x,data=test)
> dummy.coef(reg1)
Full coefficients are

(Intercept):     1.633333
x:                      1          2          3
               -0.4833333 -0.4833333  0.9666667
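
Note that dummy.coef() reports only the estimates. If you also want a standard error and t test for the omitted level, one possible sketch (not the only way) is to write each level's effect as a linear combination of the fitted coefficients; under contr.sum the third effect is minus the sum of the other two. The names K and res below are made up for illustration, and the Bonferroni column is included only because you asked about it; whether that correction suits your analysis is a separate judgment. Also keep in mind that the "overall mean" here is the unweighted mean of the three group means, not the mean of y.

```r
# Data from the original question
x <- factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3))
y <- c(1.1,1.15,1.2,1.1,1.1,1.1,1.2,1.2,1.2,2.1,2.2,2.3,2.4,2.5,
       2.6,2.7,2.8,2.9,3,3.1)
test <- data.frame(x, y)

# Sum-to-zero contrasts set for this model only (no global options() change)
fit <- lm(y ~ x, data = test, contrasts = list(x = "contr.sum"))

# Each row of K (hypothetical name) picks out one level's deviation from the
# intercept. Columns follow coef(fit): (Intercept), x1, x2.
# Level 3 is -(x1 + x2) because the effects sum to zero.
K <- rbind(level1 = c(0,  1,  0),
           level2 = c(0,  0,  1),
           level3 = c(0, -1, -1))

est  <- drop(K %*% coef(fit))                 # effect estimates
se   <- sqrt(diag(K %*% vcov(fit) %*% t(K)))  # their standard errors
tval <- est / se
pval <- 2 * pt(abs(tval), df = df.residual(fit), lower.tail = FALSE)

res <- data.frame(estimate     = est,
                  std.error    = se,
                  t.value      = tval,
                  p.value      = pval,
                  p.bonferroni = p.adjust(pval, method = "bonferroni"))
print(res)
```

The level3 row should agree with the first row of your reg2 summary (estimate 0.96667, standard error 0.07951), since both describe the same fitted model.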
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com