[R] AIC in R

Pierre Duchesne duchesne at dms.umontreal.ca
Thu Sep 28 23:45:54 CEST 2006

```
Dear R users,

According to Brockwell & Davis (1991, Section 9.3, p.304), the penalty term for
computing the AIC criterion is "p+q+1" in the context of a zero-mean
ARMA(p,q) time series model.  They arrived at this criterion (with this
particular penalty term) by estimating the Kullback-Leibler discrepancy index.
In practice, the user usually chooses the model whose estimated index is
minimum.  Consequently, it seems that the theory and the interpretation are
only available in the case of a zero-mean ARMA model, at least in the time
series context.

Concerning R, it seems that the penalty term is p+q+1 in a zero-mean model,
and p+q+1+1 = p+q+2 for an ARMA(p,q) model with a constant term.  See the
following examples:

--------------------------------------------------------------
set.seed(1)
serieAR1 = arima.sim(100,model=list(ar= 0.5))

fit1AR1 = arima(serieAR1, order = c(0, 0, 0), include.mean = TRUE)
fit2AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = TRUE)
fit3AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = FALSE)
fit4AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = TRUE)
fit5AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = FALSE)

# penalty: mean + innovation variance
-2* fit1AR1$loglik + 2*(1+1)
fit1AR1$aic

# penalty: ar1 + mean + innovation variance
-2* fit2AR1$loglik + 2*(1+1+1)
fit2AR1$aic

# penalty: ar1 + innovation variance
-2* fit3AR1$loglik + 2*(1+1)
fit3AR1$aic

# penalty: ar1 + ma1 + mean + innovation variance
-2* fit4AR1$loglik + 2*(1+1+1+1)
fit4AR1$aic

# penalty: ar1 + ma1 + innovation variance
-2* fit5AR1$loglik + 2*(1+1+1)
fit5AR1$aic

> set.seed(1)
> serieAR1 = arima.sim(100,model=list(ar= 0.5))
>
> fit1AR1 = arima(serieAR1, order = c(0, 0, 0), include.mean = TRUE)
> fit2AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = TRUE)
> fit3AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = FALSE)
> fit4AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = TRUE)
> fit5AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = FALSE)
>
> -2* fit1AR1$loglik + 2*(1+1)
 297.4670
> fit1AR1$aic
 297.4670
>
> -2* fit2AR1$loglik + 2*(1+1+1)
 270.5381
> fit2AR1$aic
 270.5381
>
> -2* fit3AR1$loglik + 2*(1+1)
 270.6653
> fit3AR1$aic
 270.6653
>
> -2* fit4AR1$loglik + 2*(1+1+1+1)
 272.3530
> fit4AR1$aic
 272.3530
>
> -2* fit5AR1$loglik + 2*(1+1+1)
 272.5564
> fit5AR1$aic
 272.5564
--------------------------------------------------------------

From the help file of extractAIC(), it seems that the criterion used is:
AIC = - 2*log L +  k * edf,
where L is the likelihood and 'edf' the equivalent degrees of freedom (i.e.,
the number of free parameters for usual parametric models) of 'fit'.
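
The parameter count can also be checked without tallying terms by hand.
This is only a minimal sketch: the names x, fit and k are illustrative, and
it assumes that logLik() on an arima() fit reports its degrees of freedom as
the number of estimated coefficients plus one for the innovation variance.

--------------------------------------------------------------
set.seed(1)
x <- arima.sim(model = list(ar = 0.5), n = 100)
fit <- arima(x, order = c(1, 0, 1), include.mean = TRUE)

# estimated coefficients (ar1, ma1, intercept) plus the innovation variance
k <- length(coef(fit)) + 1
k
attr(logLik(fit), "df")    # should match k under the assumption above

# the two values below should agree if the penalty is 2 * k
-2 * fit$loglik + 2 * k
fit$aic
--------------------------------------------------------------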

My question is: is there any justification for computing the AIC as done by
R when a constant term is in the model?

Best regards,
Pierre

Note: for differenced time series (d >= 1), the penalty term seems to be
p+q+1, since no constant term is included in the fit.
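
A small sketch of the differenced case, under the assumption that arima()
ignores include.mean once d >= 1 (the names y and fitd are illustrative):

--------------------------------------------------------------
set.seed(1)
# integrate a simulated AR(1) series so that one difference is needed
y <- cumsum(arima.sim(model = list(ar = 0.5), n = 100))
fitd <- arima(y, order = c(1, 1, 0), include.mean = TRUE)

# expected to list only the AR coefficient, with no intercept
names(coef(fitd))

# penalty: ar1 + innovation variance, i.e. p+q+1 with p = 1, q = 0
-2 * fitd$loglik + 2 * (1 + 1)
fitd$aic
--------------------------------------------------------------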

-------
Pierre Duchesne,
Département de mathématiques et statistique,
Université de Montréal,
CP 6128 Succ. Centre-Ville,