[R] AIC in R
Pierre Duchesne
duchesne at dms.umontreal.ca
Thu Sep 28 23:45:54 CEST 2006
Dear R users,
According to Brockwell & Davis (1991, Section 9.3, p.304), the penalty term for
computing the AIC criterion is "p+q+1" in the context of a zero-mean
ARMA(p,q) time series model. They arrived at this criterion (with this
particular penalty term) by estimating the Kullback-Leibler discrepancy index.
In practice, the user usually chooses the model whose estimated index is
smallest. Consequently, it seems that the theory and the interpretation are
only available for a zero-mean ARMA model, at least in the time
series context.
Concerning R, it seems that the penalty term is p+q+1 in a zero-mean model,
and p+q+1+1 = p+q+2 for an ARMA(p,q) model with a constant term. See the
following examples:
--------------------------------------------------------------
# Simulate 100 observations from a zero-mean AR(1) with coefficient 0.5
set.seed(1)
serieAR1 = arima.sim(model = list(ar = 0.5), n = 100)
# Candidate fits, with and without a constant (mean) term
fit1AR1 = arima(serieAR1, order = c(0, 0, 0), include.mean = TRUE)
fit2AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = TRUE)
fit3AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = FALSE)
fit4AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = TRUE)
fit5AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = FALSE)
# Compare a hand-computed AIC with the value stored by arima():
# penalty 2*(p+q+1) without a mean, 2*(p+q+2) with a mean
-2* fit1AR1$loglik + 2*(1+1)
fit1AR1$aic
-2* fit2AR1$loglik + 2*(1+1+1)
fit2AR1$aic
-2* fit3AR1$loglik + 2*(1+1)
fit3AR1$aic
-2* fit4AR1$loglik + 2*(1+1+1+1)
fit4AR1$aic
-2* fit5AR1$loglik + 2*(1+1+1)
fit5AR1$aic
> set.seed(1)
> serieAR1 = arima.sim(100,model=list(ar= 0.5))
>
> fit1AR1 = arima(serieAR1, order = c(0, 0, 0), include.mean = T)
> fit2AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = T)
> fit3AR1 = arima(serieAR1, order = c(1, 0, 0), include.mean = F)
> fit4AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = T)
> fit5AR1 = arima(serieAR1, order = c(1, 0, 1), include.mean = F)
>
> -2* fit1AR1$loglik + 2*(1+1)
[1] 297.4670
> fit1AR1$aic
[1] 297.4670
>
> -2* fit2AR1$loglik + 2*(1+1+1)
[1] 270.5381
> fit2AR1$aic
[1] 270.5381
>
> -2* fit3AR1$loglik + 2*(1+1)
[1] 270.6653
> fit3AR1$aic
[1] 270.6653
>
> -2* fit4AR1$loglik + 2*(1+1+1+1)
[1] 272.3530
> fit4AR1$aic
[1] 272.3530
>
> -2* fit5AR1$loglik + 2*(1+1+1)
[1] 272.5564
> fit5AR1$aic
[1] 272.5564
--------------------------------------------------------------
From the help file of extractAIC(), it seems that the criterion used is:
AIC = - 2*log L + k * edf,
where L is the likelihood and 'edf' the equivalent degrees of freedom (i.e.,
the number of free parameters for usual parametric models) of 'fit'.
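As a quick check of this formula against an arima() fit, here is a minimal sketch. The object names (x, fit, n_coef, edf, aic_by_hand) are chosen only for illustration, and it assumes the "df" attribute of logLik() is the parameter count R uses for the penalty, with k = 2:

```r
# Sketch: read the parameter count off the fitted object instead of
# counting by hand, then rebuild the AIC from it.
set.seed(1)
x <- arima.sim(model = list(ar = 0.5), n = 100)
fit <- arima(x, order = c(1, 0, 1), include.mean = TRUE)

n_coef <- length(coef(fit))        # estimated coefficients: ar1, ma1, intercept
edf <- attr(logLik(fit), "df")     # count attached to the log-likelihood
aic_by_hand <- -2 * fit$loglik + 2 * edf

print(c(n_coef = n_coef, edf = edf))
print(all.equal(aic_by_hand, fit$aic))   # hand-built value vs. stored value
print(all.equal(AIC(fit), fit$aic))      # generic AIC() vs. stored value
```

Here edf comes out one larger than the number of coefficients, which is consistent with the innovation variance being counted as an estimated parameter alongside p, q and the constant.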
My question is: is there any justification for computing the AIC as done by
R when a constant term is in the model?
Your help will be appreciated.
Best regards,
Pierre
Note: for differenced time series (d >= 1), the penalty term seems to be
p+q+1, and there is no constant term in the fit.
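The differenced case can be checked the same way. This is only a sketch for d = 1. The names y, fit_d1 and edf_d1 are illustrative, and the simulated ARIMA(1,1,0) series is an assumption made for the example:

```r
# Sketch: with d = 1, arima() fits no constant even if include.mean = TRUE.
set.seed(1)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.5), n = 100)
fit_d1 <- arima(y, order = c(1, 1, 0), include.mean = TRUE)

print(names(coef(fit_d1)))               # only the AR coefficient is estimated
edf_d1 <- attr(logLik(fit_d1), "df")     # parameter count used in the penalty
print(edf_d1)
print(all.equal(fit_d1$aic, -2 * fit_d1$loglik + 2 * edf_d1))
```

With p = 1 and q = 0 the count is p+q+1 = 2, so the constant does not enter the penalty here.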
-------
Pierre Duchesne,
Département de mathématiques et statistique,
Université de Montréal,
CP 6128 Succ. Centre-Ville,
Montréal, Québec, Canada H3C 3J7.