[R-sig-ME] model selection in lme4

Mon Feb 16 02:50:30 CET 2009

  It would be better to use AICc, but I'm not sure what I would
use for "number of parameters" for a random effect with n
levels: any number between 0.5 and n seems plausible!
Someone should send this question to Shane Richards (who has done
some very nice work testing (Q)AIC(c) in ecological settings)
and see if he's willing to tackle it, although I can
imagine he's getting sick of this kind of exercise ...
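
  To make the sensitivity concrete, here is a minimal sketch, assuming the
usual small-sample correction AICc = AIC + 2k(k+1)/(n_obs - k - 1). The
log-likelihood, sample size and parameter counts are hypothetical numbers
chosen for illustration, and aicc() is a helper defined here, not an lme4
function. It only shows how much the answer moves when a random effect with
n levels is counted as 1 parameter versus n parameters.

```python
def aicc(log_lik, k, n_obs):
    """AIC plus the small-sample correction 2k(k+1)/(n_obs - k - 1)."""
    if n_obs - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n_obs > k + 1")
    aic = -2.0 * log_lik + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n_obs - k - 1)


# Hypothetical values, for illustration only (not from any real fit).
log_lik = -120.0   # maximized log-likelihood of the fitted model
n_obs = 60         # number of observations
n_fixed = 3        # number of fixed-effect coefficients
n_levels = 10      # levels of the single random effect

# Count the random effect as 1 parameter (its variance) or as n_levels
# parameters (one per level); the "right" count is the open question.
for k_random in (1, n_levels):
    k = n_fixed + k_random
    print(f"k = {k:2d}  AICc = {aicc(log_lik, k, n_obs):.2f}")
```

With the same likelihood, the two ways of counting give AICc values that
differ by far more than the few units usually treated as a meaningful
delta-AIC, so the choice of k can by itself reverse a model ranking.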

  Ben Bolker

Renwick, A. R. wrote:
> Just a quickie Ben,
> Are you saying that you would use AIC rather than AICc even with
> small sample size - due to difficulty in counting residual degrees of
> freedom?
> Thanks
> Anna
> p.s. this forum really is fantastic
> 
> ________________________________________
> From: r-sig-mixed-models-bounces at r-project.org [r-sig-mixed-models-bounces at r-project.org] On Behalf Of Ben Bolker [bolker at ufl.edu]
> Sent: 15 February 2009 23:07
> To: Christopher David Desjardins
> Cc: r-sig-mixed-models at r-project.org; tahirajamil at yahoo.com
> Subject: Re: [R-sig-ME] model selection in lme4
> 
>   Some caution on this advice: you seem to be quoting
> the general advice on AIC/BIC/AICc.
> 
>   1. The AIC/BIC distinction is between "best prediction"
> and "consistent estimation of the true model dimension", e.g.
> 
> Yang, Yuhong. 2005. Can the strengths of AIC and BIC be shared? A
> conflict between model identification and regression estimation.
> Biometrika 92, no. 4 (December 1): 937-950. doi:10.1093/biomet/92.4.937.
> 
>   I favor AIC on these grounds, but you can decide for yourself.
> 
>   2. For models with different random effects, AIC and BIC share
> a "degrees of freedom counting" problem with all model selection
> approaches -- there are two aspects here: (1) whether you are
> focused on individual-level prediction or population-level
> prediction (Vaida and Blanchard 2005, Spiegelhalter et al. 2002)
> and (2) whether AIC/BIC share the boundary problems that
> also apply to likelihood ratio tests (Greven, Sonja. 2008. Non-Standard
> Problems in Inference for Additive and Linear Mixed Models. Göttingen,
> Germany: Cuvillier Verlag.
> http://www.cuvillier.de/flycms/en/html/30/-UickI3zKPS,3cEY=/Buchdetails.html?SID=wVZnpL8f0fbc.
> )
> 
>   3. AIC and BIC rely on asymptotic approximations, which can be
> especially problematic in random effects problems when there is not
> a large number of random blocks (the same issue makes likelihood
> ratio tests NOT OK for fixed-effect comparisons with small numbers
> of blocks (Pinheiro and Bates 2000)).  If you want to use
> AICc then you are back to counting residual degrees of freedom ...
> as far as I know there isn't much guidance available on this
> issue.
> 
>   My bottom line:
> 
>   I would go ahead and use (Q)AIC with caution for data sets with large
> (?) numbers of blocks.  With smaller numbers of blocks I would probably
> try to find some kind of randomization/permutation approach to get a
> sense of the relevant size of delta-AIC values ...
>    ... or damn the torpedoes and see if you can get away with straight
> AIC.
> 
>   Ben Bolker
> 
> Christopher David Desjardins wrote:
>> You could use either the BIC or the AIC. My understanding is that the
>> AIC tends to favor overly complex models whereas the BIC tends to
>> favor parsimonious models. I am generally inclined to always use the
>> BIC. If you have a small sample size you might also consider using the
>> AICc, which is a correction of the AIC for small sample sizes. That
>> said, in my experience the AICc still selects more complex models than
>> the BIC. Also, if you have nested models you could use chi-square
>> (likelihood ratio) tests.
>> Cheers,
>> Chris
>>
>> On Feb 15, 2009, at 4:44 PM, Tahira Jamil wrote:
>>
>>> Hi
>>> I have run GLMM models in lme4 with different fixed effects and
>>> random effects. But now the problem is model selection. Are AIC or BIC
>>> results definitive, especially for generalized linear mixed models,
>>> or what criteria should I use for model selection? I need to decide
>>> which explanatory variables should be in the model, because I have more
>>> than 10 explanatory variables and some are entering the model as
>>> random effects. In some cases AIC has a lower value but BIC is
>>> comparatively high.
>>>    Some suggestions for model selection would be highly appreciated.
>>>
>>>    With best wishes
>>>    T Jamil
>>>    Ph.D student
>>>    Biometris
>>>    Wageningen University and Research centre Netherlands.
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> -----------------
>> Christopher David Desjardins
>> Ph.D. Student
>> Quantitative Methods in Education
>> Department of Educational Psychology
>> University of  Minnesota
>> http://blog.lib.umn.edu/desja004/educationalpsychology/
>>
> 
> 
> --
> Ben Bolker
> Associate professor, Biology Dep't, Univ. of Florida
> bolker at ufl.edu / www.zoology.ufl.edu/bolker
> GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
> 
> 
> 
