[R-sig-ME] Problem with the categorical predictor in the factor format at level 1
Ben Bolker
bbolker at gmail.com
Wed Feb 20 03:18:21 CET 2013
Sunthud Pornprasertmanit <psunthud at ...> writes:
>
> Dear all,
>
> I have run a model with fixed intercepts but random slopes on categorical
> predictors by the following command:
>
> FixedIntRandomSlope <- lmer(POPULAR ~ 1 + SEX + (0 + SEX|SCHOOL), data =
> popular, REML = FALSE)
> summary(FixedIntRandomSlope)
>
> I got different results for the random effects depending on whether I
> coded SEX manually as a dummy variable or treated SEX as a factor.
>
> Here is the result for the dummy-variable predictor:
>
> Random effects:
>  Groups   Name Variance Std.Dev.
>  SCHOOL   SEX  0.87531  0.93558
>  Residual      0.87053  0.93302
>
> Here is the result for the variable transformed into factor format:
>
> Random effects:
>  Groups   Name Variance Std.Dev. Corr
>  SCHOOL   SEX0 0.93044  0.96459
>           SEX1 0.92104  0.95971  0.855
>  Residual      0.39244  0.62645
>
> I think SEX0 and SEX1 should not be both random effects.
>
> I have checked the predictor and found that the variable really has two
> categories:
>
> > summary(popular$SEX)
>    0    1
> 1026  974
>
> I use lme4 version lme4_0.999999-0.
>
> Please teach me what is going on in this case. Thank you very much.
>
I believe this is a weakness in the way that lme4 constructs
random effects. The problem is that it falls back on R's standard
model-matrix constructor (model.matrix()); in this case the formula
~0+SEX, considered by itself, gives rise to a "no-intercept" matrix,
which is *not* a one-column model matrix but rather a two-column one,
with one dummy variable per factor level.
For example:
d <- data.frame(SEX=factor(0:1))
model.matrix(~SEX,data=d)
##   (Intercept) SEX1
## 1           1    0
## 2           1    1
model.matrix(~0+SEX,data=d)
##   SEX0 SEX1
## 1    1    0
## 2    0    1
rather than the model matrix you want, which is just
##   SEX1
## 1    0
## 2    1
The workaround is (as you have done) to create your own dummy
variable.
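
A minimal sketch of that workaround, using base R only. The column name
SEX_num and the four-row toy data frame are made up for illustration; the
commented-out lmer() call assumes your 'popular' data and has not been run
here.

```r
# Toy data: SEX stored as a factor with levels "0" and "1"
d <- data.frame(SEX = factor(c(0, 1, 1, 0)))

# Build a numeric 0/1 dummy by hand. Note that as.numeric(d$SEX) alone
# would return the internal level codes (1, 2), not 0/1.
d$SEX_num <- as.numeric(d$SEX == "1")

# The factor gives two indicator columns in a no-intercept formula ...
mm_factor <- model.matrix(~ 0 + SEX, data = d)
# ... while the numeric dummy gives the single column we want
mm_dummy <- model.matrix(~ 0 + SEX_num, data = d)

ncol(mm_factor)
ncol(mm_dummy)

# Assumed usage with the original data (hypothetical SEX_num column):
# popular$SEX_num <- as.numeric(popular$SEX == "1")
# lmer(POPULAR ~ 1 + SEX + (0 + SEX_num | SCHOOL),
#      data = popular, REML = FALSE)
```

Keeping SEX as a factor in the fixed-effects part and using the numeric
dummy only inside the random-effects term is one option; using the dummy
in both places should work as well.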
The other disturbing part of this is that the model with (0+SEX|SCHOOL)
is actually unidentifiable (I think), but lmer goes ahead and fits
something for you anyway, without warning you.
This is definitely worth posting as an issue at
https://github.com/lme4/lme4/issues?state=open : if I get a
chance I will do it, but you are encouraged to do so ...