[R] Insignificant variable improves AIC (multinom)?
Ravi Varadhan
rvaradhan at jhmi.edu
Sat Jun 13 19:43:42 CEST 2009
Oops. In my previous email I meant to say the following:
In the AIC approach, you include a new variable or delete an existing variable when the change in the "log-likelihood" value, meaning -2 times the maximized log-likelihood (the deviance), is 2 or more for each parameter added or deleted.
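To make the arithmetic concrete, here is a minimal sketch in base R. It assumes the usual definition AIC = -2*logLik + 2*k, where k is the number of estimated parameters; the variable names are only illustrative.

```r
# AIC = -2 * logLik + 2 * k, so one extra parameter lowers the AIC
# only if -2 * logLik (the deviance) drops by more than 2.
deviance_drop <- 2   # break-even drop in -2 * logLik
extra_params  <- 1   # one additional parameter

# Change in AIC at the break-even point (a negative value would be an improvement)
delta_aic <- -deviance_drop + 2 * extra_params

# p-value of the likelihood ratio test for the same deviance drop
p_breakeven <- pchisq(deviance_drop, df = extra_params, lower.tail = FALSE)

print(delta_aic)              # 0
print(round(p_breakeven, 3))  # 0.157
```

Note that in a multinomial logit with K outcome categories, one added covariate brings K - 1 coefficients, so the break-even deviance drop is 2*(K - 1) and the matching p-value is no longer 0.157. For K = 3, for instance, pchisq(4, df = 2, lower.tail = FALSE) is about 0.135.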
Ravi.
____________________________________________________________________
Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvaradhan at jhmi.edu
----- Original Message -----
From: Ravi Varadhan <rvaradhan at jhmi.edu>
Date: Saturday, June 13, 2009 1:40 pm
Subject: Re: [R] Insignificant variable improves AIC (multinom)?
To: Werner Wernersen <pensterfuzzer at yahoo.de>
Cc: r-help at stat.math.ethz.ch
> Hi Werner,
>
> AICs of nested models are compared on an additive scale, not on a
> multiplicative scale. So you have to think about how much the AIC
> decreases when you add the new variable, not the factor by which it is
> reduced.
>
> If you are doing a stepwise selection based on AIC, then the p-value
> approach and AIC approach are related. In the AIC approach, you
> include a new variable or delete an existing variable when the change
> in AIC score is 2 or more. In the stepwise likelihood ratio test,
> LRT, (a.k.a. F-test in linear regression), to select variables, the
> AIC score change of 2 corresponds roughly to a p-value of 0.15, i.e.
> entering or deleting a variable if the p-value for the LRT is less
> than 0.15.
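The link between the AIC difference and the LRT for two nested multinom fits can be checked directly. The sketch below is only an illustration: the data are simulated, the names (x1, x2, fit0, fit1) are made up, and it assumes the nnet package is installed. x2 is pure noise by construction.

```r
library(nnet)  # provides multinom()

set.seed(1)
n  <- 300
x1 <- rnorm(n)   # predictor that really drives the outcome
x2 <- rnorm(n)   # pure noise, unrelated to the outcome

# Three-category outcome whose probabilities depend on x1 only
eta <- cbind(0, 0.8 * x1, -0.8 * x1)
p   <- exp(eta) / rowSums(exp(eta))
y   <- factor(apply(p, 1, function(pr) sample(c("a", "b", "c"), 1, prob = pr)))
d   <- data.frame(y, x1, x2)

fit0 <- multinom(y ~ x1,      data = d, trace = FALSE)  # smaller model
fit1 <- multinom(y ~ x1 + x2, data = d, trace = FALSE)  # adds the noise variable

# Number of extra coefficients: K - 1 = 2 for a three-category outcome
extra_df <- attr(logLik(fit1), "df") - attr(logLik(fit0), "df")

# Likelihood ratio statistic and its chi-square p-value
lr_stat <- as.numeric(2 * (logLik(fit1) - logLik(fit0)))
p_value <- pchisq(lr_stat, df = extra_df, lower.tail = FALSE)

# AIC difference on the additive scale; positive means fit1 has the lower AIC
aic_drop <- AIC(fit0) - AIC(fit1)

# By definition, aic_drop equals lr_stat - 2 * extra_df
print(c(lr_stat = lr_stat, extra_df = extra_df,
        p_value = p_value, aic_drop = aic_drop))
```

The point is the identity in the last comment: the AIC favours the larger model only when the LR statistic exceeds twice the number of added coefficients, which is the same test statistic the LRT p-value is computed from.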
>
> Of course, the big issue is that the sampling properties of stepwise
> model selection procedures are extremely difficult to characterize.
> Resampling and cross-validation approaches can help address this
> problem. Another more principled approach to model selection is to use
> regularization methods (e.g. ridge, lasso). But there is no free
> lunch. In regularization methods, one has to decide on the degree of
> regularization.
>
> I hope I have successfully convinced you about the perils and
> pitfalls of model selection.
>
> Best,
> Ravi.
> ----- Original Message -----
> From: Werner Wernersen <pensterfuzzer at yahoo.de>
> Date: Saturday, June 13, 2009 10:52 am
> Subject: Re: [R] Insignificant variable improves AIC (multinom)?
> To: Peter Flom <peterflomconsulting at mindspring.com>, r-help at stat.math.ethz.ch
>
>
> > > > Hi,
> > > >
> > > > I am trying to specify a multinomial logit model using the multinom
> > > > function from the nnet package. Now I add another independent
> > > > variable and it halves the AIC as given by summary(multinom()). But
> > > > when I call Anova(multinom()) from the car package, it tells me
> > > > that this added variable is insignificant (Pr(>Chisq)=0.39). Thus,
> > > > the improved AIC suggests to keep the variable but the Anova
> > > > suggests to drop it.
> > > >
> > > > I am sure this is due to my lack of understanding of these models
> > > > but could someone help me out with a pointer what my mistake is?
> > >
> > > I am not sure why you would expect the same answer from AIC and
> > > p-value. They are different questions. AIC attempts to answer a
> > > question about overall model fit. p-value for a particular variable
> > > attempts to answer whether that particular coefficient could be due
> > > to chance if the population value of the parameter was 0.
> > >
> > > One way these could give different answers is if the new variable
> > > affected the parameter estimates for the other parameters.
> > >
> > > It's yet another exemplar of the problems with using p-values for
> > > model selection
> > >
> > > HTH
> > >
> > > Peter
> > >
> > > Peter L. Flom, PhD
> > > Statistical Consultant
> > > www DOT peterflomconsulting DOT com
> >
> > That was very enlightening. I have to read up on model selection.
> > The thought I have to get my head around is that the added variable
> > helps explain the observed variability in the data and thus should
> > be retained in the model. But since the coefficient is insignificant,
> > I cannot interpret it, and if I use this equation for predictions
> > then I add a "random" value, since I cannot reject that the
> > coefficient is actually zero instead of what I estimated.
> >
> > One just never sees someone presenting regression coefficients which
> > are not significant, although model selection procedures are often
> > based on the AIC...
> >
> > Have a good weekend,
> > Werner
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> >
> > PLEASE do read the posting guide
> > and provide commented, minimal, self-contained, reproducible code.