[R] Can anybody help me understand AIC and BIC and devise a new metric?

Stephan Kolassa Stephan.Kolassa at gmx.de
Mon Jul 12 18:19:58 CEST 2010


Hi,

one comment: Claeskens and Hjort define AIC as 2*log L - 2*p for a model
with likelihood L and p parameters; consequently, they look for models
with *maximum* AIC in model selection and averaging. This differs from
the vast majority of authors (and R), who define AIC as -2*log L + 2*p
and search for the model with *minimum* AIC. Their definition of BIC is
similarly the negative of the "normal" BIC, -2*log L + p*log(n) for n
observations.

I would compare this to defining \pi as the base of the natural
logarithm and e as the ratio of a circle's circumference to its
diameter: of course, you can do perfectly valid mathematics with your
own definitions, but it is a recipe for confusion.

Anyone who only reads Claeskens and Hjort, fires up R and selects the
model with the maximum AIC from the candidate models is in for some
*nasty* surprises.
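
To make the sign issue concrete, here is a minimal sketch in Python rather than R. The log-likelihoods and parameter counts are made up, and the function and variable names are purely illustrative, not taken from either book or from R. It computes AIC under both conventions for three hypothetical candidate models and shows what happens when the "maximize" rule is applied to values on the usual scale.

```python
def aic_standard(log_lik, p):
    """AIC as most authors and R define it: -2*log L + 2*p (smaller is better)."""
    return -2.0 * log_lik + 2.0 * p


def aic_claeskens_hjort(log_lik, p):
    """AIC as Claeskens and Hjort define it: 2*log L - 2*p (larger is better)."""
    return 2.0 * log_lik - 2.0 * p


# Hypothetical candidates: name -> (maximized log-likelihood, number of parameters)
candidates = {
    "small": (-120.0, 2),
    "medium": (-112.0, 4),
    "large": (-111.5, 9),
}

standard = {name: aic_standard(ll, p) for name, (ll, p) in candidates.items()}
ch = {name: aic_claeskens_hjort(ll, p) for name, (ll, p) in candidates.items()}

# Each convention, used with its own rule, picks the same model.
best_standard = min(standard, key=standard.get)
best_ch = max(ch, key=ch.get)

# The mistake: applying the "maximize" rule to values on the usual scale.
wrong_pick = max(standard, key=standard.get)

print("minimum standard AIC:", best_standard)
print("maximum Claeskens-Hjort AIC:", best_ch)
print("maximum standard AIC (the mistake):", wrong_pick)
```

With these made-up numbers, the two conventions agree as long as each is used with its own rule, while maximizing the usual AIC picks the candidate that the criterion ranks worst.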

Worse, as far as I can see, Claeskens and Hjort nowhere mention that
their definition is the exact opposite of the (overwhelmingly) common
one.

However, Claeskens and Hjort managed to publish a book, which I have yet
to do, so it is quite possible that there is a major flaw in my
thinking. If so, I haven't found it yet, and I would be very grateful if
somebody pointed out what I misunderstand.

Otherwise, I would be *very* careful about basing my analysis
strategy on their book, although the rest of the content is very
helpful - you only need to remember where to switch signs and change
"maximize" to "minimize" etc.

For AIC and BIC novices, I would recommend going with Burnham &
Anderson, which Dennis cited below.

Best,
Stephan



Kjetil Halvorsen wrote:
> You should have a look at:
> 
> "Model Selection and Model Averaging"
> Gerda Claeskens (K.U. Leuven) and Nils Lid Hjort (University of Oslo)
> 
> Among other things, this will explain that AIC and BIC really aim at different goals.
> 
> On Mon, Jul 5, 2010 at 4:20 PM, Dennis Murphy <djmuser at gmail.com> wrote:
>> Hi:
>>
>> On Mon, Jul 5, 2010 at 7:35 AM, LosemindL <comtech.usa at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> Could anybody please help me understand AIC and BIC, and especially why
>>> they make sense?
>>>
>> Any good text that discusses model selection in detail will have some
>> discussion of AIC and BIC. Frank Harrell's book 'Regression Modeling
>> Strategies' comes immediately to mind, along with Hastie, Tibshirani and
>> Friedman (Elements of Statistical Learning) and Burnham and Anderson's
>> book (Model Selection and Multi-Model Inference), but there are many
>> other worthy texts that cover the topic. The gist is that AIC and BIC
>> penalize the log likelihood of a model by subtracting different functions
>> of its number of parameters. David's suggestion of Wikipedia is also on
>> target.
>>
>>> Furthermore, I am trying to devise a new metric related to the model
>>> selection in the financial asset management industry.
>>>
>>> As you know, the industry uses the Sharpe ratio as the main performance
>>> benchmark, which is the annualized mean of returns divided by the
>>> annualized standard deviation of returns.
>>>
>> I didn't know, but thank you for the information. Isn't this simply a
>> signal-to-noise ratio quantified on an annual basis?
>>
>>> In model selection, we would like to choose a model that yields the highest
>>> Sharpe Ratio.
>>>
>>> However, the more parameters you use, the higher the Sharpe ratio you
>>> might potentially get, and the higher the risk that your model is overfitted.
>>>
>>> I am trying to think of an AIC or BIC version of the Sharpe ratio that
>>> facilitates model selection...
>>>
>> You might be able to make some progress if you can express the (penalized)
>> log likelihood as a function of the Sharpe ratio. But if you have several
>> years of data in your model and the ratio is computed annually, then isn't
>> it a random variable rather than a parameter? If so, it changes the nature
>> of the problem, no? (Being unfamiliar with the Sharpe ratio, I fully
>> recognize that I may be completely off-base in this suggestion, but I'll
>> put it out there anyway :)
>>
>> BTW, you might find the R-sig-finance list to be a more productive resource
>> for this problem than R-help due to the specialized nature of the question.
>>
>> HTH,
>> Dennis
>>
>>> Could anybody please give me some pointers?
>>>
>>> Thanks a lot!
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/Can-anybody-help-me-understand-AIC-and-BIC-and-devise-a-new-metric-tp2278448p2278448.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>


