[R-sig-ME] stepwise model selection (of fixed effects only) using AIC?

Ben Bolker bbolker at gmail.com
Mon Jan 7 05:25:36 CET 2013


Steve Taylor <steve.taylor at ...> writes:

> 
> Hello mixed modellers,
 
> I note that one can fit mixed models and compare them by AIC using
>  anova(), as in example(glmer).
 
> But step() doesn't work on the models produced by glmer() -
> evidently because they are S4 objects, rather than some more
> philosophical objection to doing so.

It's sort of a combination, I think: on the one hand I wouldn't
stop someone who *really* wanted to from doing stepwise model
selection; (1) it's a free country and (2) there are *occasionally*
good reasons to do so.  On the other hand, I don't feel particularly
inspired to put a lot of effort into implementing this capability,
because it's so rarely a good idea.

> I want to select among combinations of fixed effect components only,
>  as one would do with step(glm()).
 
> My data set consists of two repeated measurements (say, at age 9 and
> age 11) on each of about 900 people. There are about 8 potential
> fixed effect covariates that I wish to evaluate, including
> interaction terms with Age (as a two-level factor).  The only random
> effect I'm assuming is an intercept per participant.

  I *think* that drop1() should work; this will allow you to do
stepwise regression in not more than 8 steps or so.  That is, at
each stage drop1() will tell you which of the remaining terms
in the model will most improve the model.  At worst, you will
have to run drop1(); update(model, . ~ . - worstpredictor) 8
times before you end up at the intercept-only model.  (Hopefully
you will get to stop before then, as none of the dropped-predictor
models will have a lower AIC than the full model.)
 
> Can anyone help with a way to do this?
> 
> Here's my attempt, ignoring the random effect during model selection.  
> Is this a reasonable thing to do?
> 
> modeldata1 = subset(modeldata, select= -Participant)
> glm0 = glm(Outcome ~ Age, data=modeldata1, family=binomial)
> glm1 = glm(Outcome ~ .*Age, data=modeldata1, family=binomial)
> scope = list(lower=formula(glm0), upper=formula(glm1))
> glm2 = step(glm0, scope, direction='forward')

  I generally prefer backward to forward selection ...

> mm0 = glmer(Outcome ~ (1|Participant) + Age, 
>       data=modeldata, family=binomial)
> mm2 = update(mm0, formula=update(formula(glm2),'. ~ (1|Participant) + .'))
> 
> Interestingly, the results are pretty much the same, from:
> library(effects)
> plot(allEffects(glm2))
> plot(allEffects(mm2))



More information about the R-sig-mixed-models mailing list