arima {stats}  R Documentation 
Fit an ARIMA model to a univariate time series.
arima(x, order = c(0L, 0L, 0L),
      seasonal = list(order = c(0L, 0L, 0L), period = NA),
      xreg = NULL, include.mean = TRUE,
      transform.pars = TRUE,
      fixed = NULL, init = NULL,
      method = c("CSS-ML", "ML", "CSS"), n.cond,
      SSinit = c("Gardner1980", "Rossignol2011"),
      optim.method = "BFGS",
      optim.control = list(), kappa = 1e6)
x 
a univariate time series 
order 
A specification of the non-seasonal part of the ARIMA
model: the three integer components (p, d, q) are the AR order, the
degree of differencing, and the MA order.
seasonal 
A specification of the seasonal part of the ARIMA
model, plus the period (which defaults to frequency(x)).
xreg 
Optionally, a vector or matrix of external regressors,
which must have the same number of rows as x.
include.mean 
Should the ARMA model include a mean/intercept term? The
default is TRUE for undifferenced series, and it is ignored for ARIMA
models with differencing.
transform.pars 
logical; if true, the AR parameters are
transformed to ensure that they remain in the region of
stationarity. Not used for method = "CSS".
fixed 
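optional numeric vector of the same length as the total number of coefficients to be estimated. It should be of the form
(\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q, \Phi_1, \ldots, \Phi_P, \Theta_1, \ldots, \Theta_Q, \mu),
where \phi_i are the AR coefficients, \theta_i the MA coefficients, \Phi_i the seasonal AR coefficients, \Theta_i the seasonal MA coefficients and \mu the intercept term. The \mu entry is required if and only if include.mean is TRUE. The entries of the fixed vector should be the values at which the corresponding coefficients are to be fixed, or NA for coefficients that should be estimated. The argument transform.pars will be set to FALSE if any AR parameters are fixed.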
init 
optional numeric vector of initial parameter
values. Missing values will be filled in, by zeroes except for
regression coefficients. Values already specified in fixed will be ignored.
method 
fitting method: maximum likelihood or minimize conditional sum-of-squares. The default (unless there are missing values) is to use conditional-sum-of-squares to find starting values, then maximum likelihood. Can be abbreviated.
n.cond 
only used if fitting by conditional-sum-of-squares: the number of initial observations to ignore. It will be ignored if less than the maximum lag of an AR term.
SSinit 
a string specifying the algorithm to compute the
state-space initialization of the likelihood; see
KalmanLike for details. May be abbreviated.
optim.method 
The value passed as the method argument to optim.
optim.control 
List of control parameters for optim.
kappa 
the prior variance (as a multiple of the innovations variance) for the past observations in a differenced model. Do not reduce this. 
Different definitions of ARMA models have different signs for the AR and/or MA coefficients. The definition used here has
X_t = a_1 X_{t-1} + \cdots + a_p X_{t-p} + e_t + b_1 e_{t-1} + \cdots + b_q e_{t-q}
and so the MA coefficients differ in sign from those of S-PLUS.
Further, if include.mean is true (the default for an ARMA
model), this formula applies to X - m rather than X. For
ARIMA models with differencing, the differenced series follows a
zero-mean ARMA model. If an xreg term is included, a linear
regression (with a constant term if include.mean is true and
there is no differencing) is fitted with an ARMA model for the error
term.
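As a small illustration of the X - m convention, the sketch below fits an AR(1) model to the lh series and reproduces the one-step-ahead forecast by hand from the formula. The variable names (a1, m, manual, pred) are chosen here for illustration only. Note that the coefficient labelled "intercept" is used as the mean m, not as a regression-style constant.

```r
# Sketch: the AR(1) formula applies to X - m, where m is the reported "intercept".
fit <- arima(lh, order = c(1, 0, 0))
a1 <- coef(fit)[["ar1"]]
m  <- coef(fit)[["intercept"]]

# One-step forecast by hand: X_{n+1} - m = a1 * (X_n - m)
manual <- m + a1 * (lh[length(lh)] - m)

# The same forecast from predict()
pred <- predict(fit, n.ahead = 1)$pred[1]

all.equal(manual, pred)
```

For models with MA terms the hand calculation would also need the fitted innovations, so this shortcut is specific to the pure AR case.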
The variance matrix of the estimates is found from the Hessian of the log-likelihood, and so may only be a rough guide.
Optimization is done by optim. It will work
best if the columns in xreg are roughly scaled to zero mean
and unit variance, but does attempt to estimate suitable scalings.
A list of class "Arima" with components:
coef 
a vector of AR, MA and regression coefficients, which can
be extracted by the coef method.
sigma2 
the MLE of the innovations variance. 
var.coef 
the estimated variance matrix of the coefficients
coef, which can be extracted by the vcov method.
loglik 
the maximized log-likelihood (of the differenced data), or the approximation to it used.
arma 
A compact form of the specification, as a vector giving the number of AR, MA, seasonal AR and seasonal MA coefficients, plus the period and the number of non-seasonal and seasonal differences.
aic 
the AIC value corresponding to the log-likelihood. Only
valid for method = "ML" fits.
residuals 
the fitted innovations. 
call 
the matched call. 
series 
the name of the series 
code 
the convergence value returned by optim.
n.cond 
the number of initial observations not used in the fitting. 
nobs 
the number of “used” observations for the fitting,
can also be extracted via nobs() and is used by BIC.
model 
A list representing the Kalman Filter used in the
fitting. See KalmanLike.
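The components listed above can be inspected as in the following sketch, which uses the lh series from the examples. The names est, se, npar and aic_manual are introduced here for illustration. The standard errors are only a rough guide because they come from the Hessian-based variance matrix, and the AIC check assumes the parameter count is the number of estimated coefficients plus one for the innovations variance.

```r
# Sketch: pulling the main components out of an "Arima" fit.
fit <- arima(lh, order = c(1, 0, 0))

est <- coef(fit)                 # AR, MA and regression coefficients
se  <- sqrt(diag(vcov(fit)))     # rough standard errors from var.coef

fit$arma                         # c(p, q, P, Q, period, d, D)
nobs(fit)                        # number of "used" observations

# AIC from the log-likelihood: coefficients plus the innovations variance
npar <- length(est) + 1
aic_manual <- -2 * fit$loglik + 2 * npar
all.equal(aic_manual, fit$aic)
```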
The exact likelihood is computed via a state-space representation of
the ARIMA process, and the innovations and their variance found by a
Kalman filter. The initialization of the differenced ARMA process uses
stationarity and is based on Gardner et al. (1980). For a
differenced process the non-stationary components are given a diffuse
prior (controlled by kappa). Observations which are still
controlled by the diffuse prior (determined by having a Kalman gain of
at least 1e4) are excluded from the likelihood calculations.
(This gives comparable results to arima0 in the absence
of missing values, when the observations excluded are precisely those
dropped by the differencing.)
Missing values are allowed, and are handled exactly in method "ML".
If transform.pars is true, the optimization is done using an
alternative parametrization which is a variation on that suggested by
Jones (1980) and ensures that the model is stationary. For an AR(p)
model the parametrization is via the inverse tanh of the partial
autocorrelations: the same procedure is applied (separately) to the
AR and seasonal AR terms. The MA terms are not constrained to be
invertible during optimization, but they will be converted to
invertible form after optimization if transform.pars is true.
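The idea behind this parametrization can be sketched in a few lines. The helper pacf_to_ar below is hypothetical (it is not part of the stats package and is not the internal code used by arima): it maps unconstrained real values through tanh to partial autocorrelations in (-1, 1), then builds the AR coefficients with the Durbin-Levinson recursion, so that any real input yields a stationary AR(p) model.

```r
# Hypothetical helper, for illustration only: unconstrained values -> AR coefficients.
pacf_to_ar <- function(u) {
  r <- tanh(u)                   # partial autocorrelations, each in (-1, 1)
  phi <- numeric(0)
  for (k in seq_along(r)) {
    # Durbin-Levinson step: phi_{k,j} = phi_{k-1,j} - r_k * phi_{k-1,k-j}
    phi <- c(phi - r[k] * rev(phi), r[k])
  }
  phi
}

u   <- c(1.2, -0.8, 0.3)         # arbitrary unconstrained values
phi <- pacf_to_ar(u)

# Stationarity: roots of 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle
stationary <- all(Mod(polyroot(c(1, -phi))) > 1)

# Round trip: the partial autocorrelations of the AR model recover tanh(u)
pacf_back <- as.numeric(ARMAacf(ar = phi, lag.max = length(u), pacf = TRUE))
all.equal(pacf_back, tanh(u))
```

Optimizing over u rather than over the AR coefficients directly means the optimizer never has to deal with a stationarity constraint.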
Conditional sum-of-squares is provided mainly for expositional
purposes. This computes the sum of squares of the fitted innovations
from observation n.cond on (where n.cond is at least
the maximum lag of an AR term), treating all earlier innovations as
zero. Argument n.cond can be used to allow comparability
between different fits. The ‘part log-likelihood’ is the first
term, half the log of the estimated mean square. Missing values
are allowed, but will cause many of the innovations to be missing.
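For a pure AR(1) model with a mean term, the conditional sum-of-squares calculation can be reproduced by hand, as in the sketch below. It assumes the default n.cond for this model (one initial observation conditioned on) and uses illustrative variable names (fit_css, w, e, sigma2_manual, part_loglik) that do not appear in the package.

```r
# Sketch: conditional sum of squares for an AR(1) fit with a mean term.
fit_css <- arima(lh, order = c(1, 0, 0), method = "CSS")
a1 <- coef(fit_css)[["ar1"]]
m  <- coef(fit_css)[["intercept"]]

w <- as.numeric(lh) - m          # the model formula applies to X - m
n <- length(w)

# Fitted innovations for t = 2, ..., n; the first observation is conditioned on
e <- w[2:n] - a1 * w[1:(n - 1)]

sigma2_manual <- mean(e^2)                 # estimated mean square
part_loglik   <- 0.5 * log(sigma2_manual)  # half the log of the mean square

all.equal(sigma2_manual, fit_css$sigma2)
```

With MA terms the innovations would have to be computed recursively, starting from zeros for the earlier innovations, so the vectorized line above covers only the AR case.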
When regressors are specified, they are orthogonalized prior to fitting unless any of the coefficients is fixed. It can be helpful to roughly scale the regressors to zero mean and unit variance.
The results are likely to be different from S-PLUS's
arima.mle, which computes a conditional likelihood and does
not include a mean in the model. Further, the convention used by
arima.mle reverses the signs of the MA coefficients.
arima is very similar to arima0 for
ARMA models or for differenced models without missing values,
but handles differenced models with missing values exactly.
It is somewhat slower than arima0, particularly for seasonally
differenced models.
Brockwell, P. J. and Davis, R. A. (1996). Introduction to Time Series and Forecasting. Springer, New York. Sections 3.3 and 8.3.
Durbin, J. and Koopman, S. J. (2001). Time Series Analysis by State Space Methods. Oxford University Press.
Gardner, G., Harvey, A. C. and Phillips, G. D. A. (1980). Algorithm AS 154: An algorithm for exact maximum likelihood estimation of autoregressive-moving average models by means of Kalman filtering. Applied Statistics, 29, 311–322. doi:10.2307/2346910.
Harvey, A. C. (1993). Time Series Models. 2nd Edition. Harvester Wheatsheaf. Sections 3.3 and 4.4.
Jones, R. H. (1980). Maximum likelihood fitting of ARMA models to time series with missing observations. Technometrics, 22, 389–395. doi:10.2307/1268324.
Ripley, B. D. (2002). “Time series in R 1.5.0”. R News, 2(2), 2–7. https://www.r-project.org/doc/Rnews/Rnews_2002-2.pdf
predict.Arima, arima.sim for simulating
from an ARIMA model, tsdiag, arima0, ar
arima(lh, order = c(1,0,0))
arima(lh, order = c(3,0,0))
arima(lh, order = c(1,0,1))
arima(lh, order = c(3,0,0), method = "CSS")
arima(USAccDeaths, order = c(0,1,1), seasonal = list(order = c(0,1,1)))
arima(USAccDeaths, order = c(0,1,1), seasonal = list(order = c(0,1,1)),
      method = "CSS") # drops first 13 observations.
# for a model with as few years as this, we want full ML
arima(LakeHuron, order = c(2,0,0), xreg = time(LakeHuron) - 1920)
## presidents contains NAs
## graphs in example(acf) suggest order 1 or 3
require(graphics)
(fit1 <- arima(presidents, c(1, 0, 0)))
nobs(fit1)
tsdiag(fit1)
(fit3 <- arima(presidents, c(3, 0, 0))) # smaller AIC
tsdiag(fit3)
BIC(fit1, fit3)
## compare a whole set of models; BIC() would choose the smallest
AIC(fit1, arima(presidents, c(2,0,0)),
    arima(presidents, c(2,0,1)), # <- chosen (barely) by AIC
    fit3, arima(presidents, c(3,0,1)))
## An example of using the 'fixed' argument:
## Note that the period of the seasonal component is taken to be
## frequency(presidents), i.e. 4.
(fitSfx <- arima(presidents, order=c(2,0,1), seasonal=c(1,0,0),
                 fixed=c(NA, NA, 0.5, -0.1, 50), transform.pars=FALSE))
## The partly-fixed & smaller model seems better (as we "knew too much"):
AIC(fitSfx, arima(presidents, order=c(2,0,1), seasonal=c(1,0,0)))
## An example of ARIMA forecasting:
predict(fit3, 3)