[R] bad label change in step() from lmerTest package

Bastien.Ferland-Raymond at mffp.gouv.qc.ca
Tue Dec 16 15:52:17 CET 2014


Hello list,

I recently started working with the step() function in the lmerTest package and noticed a strange behavior that may be a bug.  The function performs stepwise selection of fixed and random effects; however, when it discards the random effect because it is not significant, it changes the label of the dependent variable in the formula of the best model.

Here is a reproducible example :

### load the library:
library(lmerTest)

###  data preparation
set.seed(1234)

## the Xs
x1 = rnorm(100,23,2)
x2 = rnorm(100,15,3)
x3 = rnorm(100,5,2)
x4 = rnorm(100,10,5)

## the dependent variable
dep = (2 * x1 +  rnorm(100,0,5)) + (-4 * x2 +  rnorm(100,0,1)) + (0.1 * x3 +  rnorm(100,0,3)) + (1 * x4 +  rnorm(100,0,8))

## the random variables: one good (significant) and one bad (not significant)
good.random = as.character(cut(dep+rnorm(100,0,2),3, c("group1","group2","group3")))
bad.random = sample(c("group1","group2","group3"), 100, replace=TRUE)

###  we make the starting models, one with the good and one with the bad random variable
mod.good <- lmer(dep ~ x1+x2+x3+x4+(1|good.random))
mod.bad  <-   lmer(dep ~ x1+x2+x3+x4+(1|bad.random))

### we do the stepwise selection
select.good <- step(mod.good) 		# should keep the random variable
select.bad <- step(mod.bad)			# should remove the random variable

###  The label of the dependent variable differs between the model where the random effect was removed and the one where it was kept.
formula(select.good$model)
# output : dep ~ x1 + x2 + x4 + (1 | good.random)
# this is what it is supposed to be : dep ~

formula(select.bad$model)
#output : y ~ x1 + x2 + x3 + x4
# here, it has been changed to : y ~
### end code

This is problematic for automatic model selection.  Is there an option I missed, or is this a bug?
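
In the meantime, one possible workaround is to restore the response name with update() from base R. This is only a minimal sketch, not part of lmerTest: f is typed by hand as a stand-in for formula(select.bad$model), and restore_response is a hypothetical helper name.

```r
# Stand-in for formula(select.bad$model), typed by hand so that the
# sketch runs without lmerTest (assumption: the returned formula looks like this)
f <- y ~ x1 + x2 + x3 + x4

# Hypothetical helper: replace the left-hand side of a formula with the
# intended response name and keep the right-hand side unchanged
restore_response <- function(form, response) {
  update(form, as.formula(paste(response, "~ .")))
}

fixed <- restore_response(f, "dep")
print(fixed)
```

The same helper could be applied to every formula returned by an automated selection loop before the model is refitted.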
Also, it is interesting to note that the stepwise selection on the model with the bad random variable did not remove x3, which is clearly not significant.  So I wonder whether the function still performs fixed-effect selection after it has removed the random effects.
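
To look at x3 independently of step(), one option is to refit the fixed part alone with lm() and inspect the single-term deletion tests from drop1(), both from base R. The sketch below regenerates the data with the same seed as in the example; it says nothing about what step() does internally, it only gives a reference point to compare against.

```r
# Regenerate the fixed-effect data exactly as in the example above
set.seed(1234)
x1 <- rnorm(100, 23, 2)
x2 <- rnorm(100, 15, 3)
x3 <- rnorm(100, 5, 2)
x4 <- rnorm(100, 10, 5)
dep <- (2 * x1 + rnorm(100, 0, 5)) + (-4 * x2 + rnorm(100, 0, 1)) +
  (0.1 * x3 + rnorm(100, 0, 3)) + (1 * x4 + rnorm(100, 0, 8))

# Fit the fixed part only, without any random effect
fit <- lm(dep ~ x1 + x2 + x3 + x4)

# F test for dropping each term in turn; the x3 row is the one of interest
tab <- drop1(fit, test = "F")
print(tab)
```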

Thanks,



Bastien Ferland-Raymond, M.Sc. Stat., M.Sc. Biol.
Division des orientations et projets spéciaux
Direction des inventaires forestiers
Ministère des Forêts, de la Faune et des Parcs 


