[R] Anyone Familiar with Using arima function with exogenous variables?

Richard A. Bilonick rab at nauticom.net
Mon Apr 21 22:09:42 CEST 2003

Dirk Eddelbuettel wrote:

>Why don't you try simulation?  
>Create some data under the 'null' you're trying to get to, say, y <-
>seq(1,n) + arima.error where arima.error could be as simple as an AR(1) or
>MA(1). Then estimate the model, using 90% or 95% of the data and evaluate
>the forecast to the retained 10% or 5%. Repeat the DGP creation, estimation,
>forecast evaluation steps N (say 500) times and you should have a good idea
>about the merits of predict.arima.

I simulated a simple model:

y[t] - c = phi*(y[t-1] - c) + b*x[t] + e[t]

where the errors e are i.i.d. and normal. The coefficient of x is just 
b=1 (x is just an increasing linear trend). I used arima.sim to simulate 
y with phi=0.5. There was no intercept so c = 0. When I used arima to 
estimate the model coefficients, the estimates were very close to what 
you would expect. The coefficient for x was near 1, the intercept was 
very close to zero, and so forth. So my problem must be in making the 
forecast. I would have thought the forecast for the next time point 
would be:

y[t+1]' = c' + phi'*(y[t] - c') + b'*x[t+1]

where y[t+1]' is the forecast, c' is the estimated intercept, phi' is 
the estimated ar coefficient, and b' is the estimated coefficient for 
the exogenous variable.

So if I have 200 observations and I want to forecast time t = 201, I 
would use y[200] and x[201] and I would have my forecast. But 
predict.Arima produces a different forecast (which looks more reasonable 
to me).
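
A minimal sketch of the experiment described above, for anyone who wants to reproduce the discrepancy. It assumes y was built as x plus an AR(1) series from arima.sim (the exact simulation call is not given above), and the seed and the names fit, naive and pred are only illustrative.

```r
set.seed(1)
n <- 200
x <- 1:n                                   # increasing linear trend, b = 1

# Assumption: y is the trend plus an AR(1) series with phi = 0.5, c = 0
y <- x + arima.sim(model = list(ar = 0.5), n = n)

# Fit AR(1) with x as exogenous regressor (column named "x" explicitly)
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(x = x))
cf  <- coef(fit)                           # names: ar1, intercept, x
print(cf)

# Hand forecast from the formula above: c' + phi' * (y[t] - c') + b' * x[t+1]
naive <- unname(cf["intercept"] + cf["ar1"] * (y[n] - cf["intercept"]) +
                cf["x"] * (n + 1))

# One-step forecast from predict(), supplying x[201] as newxreg
pred <- as.numeric(predict(fit, n.ahead = 1, newxreg = cbind(x = n + 1))$pred)

print(c(naive = naive, predict = pred))
```

Printing the two numbers side by side shows how far the hand formula is from what predict returns.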

If I estimate a simple AR(2) model, the same method produces exactly the 
forecast given by predict.Arima.
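
For comparison, a sketch of the pure AR(2) check. The AR coefficients (0.5 and 0.2), the seed and the object names are illustrative assumptions, not the values I actually used.

```r
set.seed(2)
n  <- 200
y2 <- arima.sim(model = list(ar = c(0.5, 0.2)), n = n)   # illustrative AR(2)

fit2 <- arima(y2, order = c(2, 0, 0))
cf2  <- coef(fit2)                         # names: ar1, ar2, intercept

# Hand forecast: c' + phi1' * (y[t] - c') + phi2' * (y[t-1] - c')
by_hand2 <- unname(cf2["intercept"] +
                   cf2["ar1"] * (y2[n]     - cf2["intercept"]) +
                   cf2["ar2"] * (y2[n - 1] - cf2["intercept"]))

pred2 <- as.numeric(predict(fit2, n.ahead = 1)$pred)

print(c(by_hand = by_hand2, predict = pred2))
print(isTRUE(all.equal(by_hand2, pred2, tolerance = 1e-6)))
```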

I've studied the code of predict.Arima. Unfortunately for my understanding, 
it calls KalmanForecast, whose details I can't see. It passes the arma 
information to KalmanForecast, gets a prediction and to this adds the 
intercept and the product of the exogenous variable and corresponding 
estimated coefficient.
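
The steps just described can be imitated by hand. This is only a sketch, under the assumption (which is how the arima help page describes the xreg case) that the ARMA part is fitted to the regression residuals u[t] = y[t] - c' - b'*x[t] rather than to y itself. The simulated data, seed and object names are illustrative.

```r
set.seed(1)
n <- 200
x <- 1:n
y <- x + arima.sim(model = list(ar = 0.5), n = n)   # assumed data-generating step

fit <- arima(y, order = c(1, 0, 0), xreg = cbind(x = x))
cf  <- coef(fit)                           # names: ar1, intercept, x

# Residuals from the regression part; these are what the AR(1) is applied to
u <- as.numeric(y) - cf["intercept"] - cf["x"] * x

# AR(1) forecast of the residual, plus intercept and b' * x[201]
by_hand <- unname(cf["intercept"] + cf["x"] * (n + 1) + cf["ar1"] * u[n])

pred <- as.numeric(predict(fit, n.ahead = 1, newxreg = cbind(x = n + 1))$pred)

print(c(by_hand = by_hand, predict = pred))
print(isTRUE(all.equal(by_hand, pred, tolerance = 1e-6)))
```

If the two numbers agree, the AR term in the forecast is phi'*(y[t] - c' - b'*x[t]) rather than phi'*(y[t] - c').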

What is so different about having just one exogenous variable, compared 
with a simple AR(1)?

Rick B.