[R-sig-ME] model selection in lme4

Ben Bolker bolker at ufl.edu
Mon Feb 16 00:07:04 CET 2009

  Some caution on this advice: you seem to be quoting
the general advice on AIC/BIC/AICc.

  1. The AIC/BIC distinction is between "best prediction" (AIC)
and "consistent estimation of the true model dimension" (BIC), e.g.

Yang, Yuhong. 2005. Can the strengths of AIC and BIC be shared? A
conflict between model identification and regression estimation.
Biometrika 92, no. 4 (December 1): 937-950. doi:10.1093/biomet/92.4.937.

  I favor AIC on these grounds, but you can decide for yourself.
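
  For reference, the two criteria differ only in their penalty term:
AIC = -2 log L + 2k and BIC = -2 log L + k log(n). The following is a
minimal Python sketch of that arithmetic only. The log-likelihoods,
parameter counts, and sample size are made-up numbers, not output from
any lme4 fit, and the function names are just for illustration.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: -2*logL + 2*k."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion: -2*logL + k*log(n)."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical fits: (log-likelihood, number of estimated parameters)
simple_model = (-520.0, 4)
complex_model = (-514.0, 8)
n = 200  # hypothetical number of observations

for name, (ll, k) in [("simple", simple_model), ("complex", complex_model)]:
    print(name, "AIC =", round(aic(ll, k), 2), "BIC =", round(bic(ll, k, n), 2))
```

  With these numbers AIC prefers the more complex model (1044 vs. 1048)
while BIC prefers the simpler one (about 1061.2 vs. 1070.4). The BIC
penalty per parameter exceeds the AIC penalty whenever log(n) > 2, i.e.
for n of 8 or more, so the two can easily disagree.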

  2. For models with different random effects, AIC and BIC share
a "degrees of freedom counting" problem with all model selection
approaches. There are two aspects here: (1) whether you are
focused on individual-level prediction or population-level
prediction (Vaida and Blanchard 2005, Spiegelhalter et al. 2002),
and (2) whether AIC/BIC share the boundary problems that
also apply to likelihood ratio tests (Greven, Sonja. 2008. Non-Standard
Problems in Inference for Additive and Linear Mixed Models. Göttingen,
Germany: Cuvillier Verlag).
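
  To illustrate why the parameter count is ambiguous, take a balanced
random-intercept model with q blocks of m observations each. At the
population level the random effect costs one parameter (its variance).
At the individual level its cost depends on how strongly the block
means are shrunk. The sketch below uses the trace of the hat matrix,
which for this balanced case with the variances treated as known works
out to 1 + (q - 1) * lambda, with lambda = m*tau2 / (m*tau2 + sigma2).
This is a simplification roughly in the spirit of the conditional AIC,
not the full published formula (which also accounts for estimating the
variances). The function name and all numbers are hypothetical.

```python
def effective_df_random_intercept(q, m, tau2, sigma2):
    """Trace of the hat matrix for y_ij = mu + b_i + e_ij.

    Assumes a balanced design (q blocks, m observations per block) and
    known variances: tau2 between blocks, sigma2 within blocks.
    """
    shrinkage = m * tau2 / (m * tau2 + sigma2)  # lambda, between 0 and 1
    return 1.0 + (q - 1) * shrinkage            # 1 is for the fixed intercept

q, m, sigma2 = 10, 5, 1.0  # hypothetical design
for tau2 in (0.01, 1.0, 100.0):
    rho = effective_df_random_intercept(q, m, tau2, sigma2)
    print("tau2 =", tau2, "-> effective df =", round(rho, 2))
```

  The effective count runs from just over 1 (blocks nearly identical)
to just under q (blocks essentially free fixed effects), so the penalty
depends on both the data and the level of prediction you care about.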

  3. AIC and BIC are asymptotic criteria, which can be especially
problematic with random effects problems when there is not a
large number of random blocks. (The same issue makes likelihood ratio
tests NOT OK for fixed-effect comparisons with small numbers
of blocks; see Pinheiro and Bates 2000.)  If you want to use
AICc then you are back to counting residual degrees of freedom,
and as far as I know there isn't much guidance available on this.
  My bottom line:

  I would go ahead and use (Q)AIC with caution for data sets with large
(?) numbers of blocks.  With smaller numbers of blocks I would probably
try to find some kind of randomization/permutation approach to get a
sense of the relevant size of delta-AIC values ...
   ... or damn the torpedoes and see if you can get away with straight
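
  A rough sketch of the permutation idea, in Python with the standard
library only. Several things here are assumptions made for illustration:
the data are simulated, the predictor is shuffled within blocks (one
possible permutation scheme among several), and the "model" is a
Gaussian linear fit with blocks absorbed as fixed offsets, standing in
for the real mixed model. In practice you would refit the actual lme4
model to each permuted data set instead.

```python
import math
import random

def center_within_blocks(values, blocks):
    """Subtract each block's mean (absorbs block effects as fixed offsets)."""
    sums, counts = {}, {}
    for v, b in zip(values, blocks):
        sums[b] = sums.get(b, 0.0) + v
        counts[b] = counts.get(b, 0) + 1
    return [v - sums[b] / counts[b] for v, b in zip(values, blocks)]

def delta_aic(x, y, blocks):
    """AIC(model without x) minus AIC(model with x), Gaussian errors."""
    xc = center_within_blocks(x, blocks)
    yc = center_within_blocks(y, blocks)
    rss0 = sum(v * v for v in yc)
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)
    rss1 = rss0 - sxy * sxy / sxx
    return len(y) * math.log(rss0 / rss1) - 2.0  # one extra parameter costs 2

def permute_within_blocks(x, blocks, rng):
    """Shuffle x separately inside each block, keeping the block structure."""
    by_block = {}
    for i, b in enumerate(blocks):
        by_block.setdefault(b, []).append(i)
    out = list(x)
    for idx in by_block.values():
        vals = [x[i] for i in idx]
        rng.shuffle(vals)
        for i, v in zip(idx, vals):
            out[i] = v
    return out

def permutation_delta_aic(x, y, blocks, n_perm=999, seed=1):
    """Observed delta-AIC, its permutation distribution, and a p-value."""
    rng = random.Random(seed)
    observed = delta_aic(x, y, blocks)
    null = [delta_aic(permute_within_blocks(x, blocks, rng), y, blocks)
            for _ in range(n_perm)]
    p = (1 + sum(d >= observed for d in null)) / (n_perm + 1)
    return observed, null, p

# Simulated example: 6 blocks of 8 observations, true slope 0.8
sim = random.Random(42)
blocks = [b for b in range(6) for _ in range(8)]
block_effect = [sim.gauss(0, 1) for _ in range(6)]
x = [sim.gauss(0, 1) for _ in blocks]
y = [block_effect[b] + 0.8 * xi + sim.gauss(0, 1) for b, xi in zip(blocks, x)]

observed, null, p = permutation_delta_aic(x, y, blocks)
cutoff = sorted(null)[int(0.95 * len(null))]
print("observed delta-AIC:", round(observed, 2))
print("95th percentile of permuted delta-AIC:", round(cutoff, 2))
print("permutation p-value:", p)
```

  The permuted values show how large a delta-AIC arises by chance with
this number of blocks, which is the calibration the asymptotic rules of
thumb cannot supply.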

  Ben Bolker

Christopher David Desjardins wrote:
> You could use either the BIC or the AIC. My understanding is that the  
> AIC tends to favor overly complex models whereas the BIC tends to  
> favor parsimonious models. I am generally inclined to always use the  
> BIC. If you have a small sample size you might also consider using the  
> AICC which is a correction of the AIC for small sample sizes. That  
> said, in my experience the AICC still selects more complex models than  
> the BIC. Also if you have nested models you could use the chi-square  
> tests.
> Cheers,
> Chris
> On Feb 15, 2009, at 4:44 PM, Tahira Jamil wrote:
>> Hi
>> I have run GLMM models in lme4 with different fixed effects and
>> random effects. But now the problem is model selection. Are AIC or BIC
>> results definitive, especially for generalized linear mixed models,
>> or what criteria should I use for model selection, so that I can decide
>> which explanatory variables should be in the model? I have more
>> than 10 explanatory variables, and some are entering the model as
>> random effects. In some cases AIC has a lower value but BIC is
>> comparatively high.
>>    Some suggestions for model selection would be highly appreciated.
>>    With best wishes
>>    T Jamil
>>    Ph.D student
>>    Biometris
>>    Wageningen University and Research centre Netherlands.
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> -----------------
> Christopher David Desjardins
> Ph.D. Student
> Quantitative Methods in Education
> Department of Educational Psychology
> University of  Minnesota
> http://blog.lib.umn.edu/desja004/educationalpsychology/

Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
