[RsR] Time-series analysis + robust statistics?

Ajay Shah ajayshah at mayin.org
Wed Jul 23 10:32:12 CEST 2008


I have a situation where there are influential observations in a
time-series. I'd like to identify and estimate ARMA models, and use
these for forecasting.

For simplicity, suppose we only fit AR models. I use the R function
ar(), which picks the lag order with the lowest AIC. Here, influential
observations hurt in two ways: I could be picking the wrong lag order,
and, once the lag order is chosen, the coefficient estimates could be
contaminated.

How would one do better? OLS is a consistent estimator for the AR
model, so, given a lag order, I could use robust::lmRob() for the
estimation itself. But what is the strategy for choosing the lag order
in such a setting?
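
To make the "given a lag order, estimate robustly" step concrete, here
is a rough sketch. It covers only the estimation step and says nothing
about lag-order selection. Assumptions: ar_design() is a hypothetical
helper written for this illustration, the series is simulated rather
than the data shown below, and MASS::rlm() with method = "MM" stands in
for robust::lmRob() only because MASS ships with standard R installs.
lmRob() accepts the same formula and data arguments.

```r
library(MASS)  # provides rlm(); a recommended package bundled with R

# Hypothetical helper: build the regression of x[t] on x[t-1], ..., x[t-p].
# embed(x, p + 1) returns a matrix whose first column is x[t] and whose
# remaining columns are lags 1..p.
ar_design <- function(x, p) {
  stopifnot(p >= 1, length(x) > p + 1)
  d <- as.data.frame(embed(x, p + 1))
  names(d) <- c("y", paste0("lag", seq_len(p)))
  d
}

# Simulated AR(1) series (true coefficient 0.5) with one gross outlier
# in the first position. This is illustrative data only.
set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.5), n = 200)) + 5
x[1] <- 80

d <- ar_design(x, p = 1)

fit_ols <- lm(y ~ ., data = d)                    # ordinary least squares
fit_rob <- rlm(y ~ ., data = d, method = "MM")    # robust MM-estimate

print(coef(fit_ols))
print(coef(fit_rob))
```

Note that in the lagged regression the outlier in x[1] enters as a
regressor value (a leverage point) and not as a response, which is why
a plain M-estimator would not be enough and an MM-type estimator is
used in the sketch.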

Here's an example of what I'm facing. First, here's a data vector `x':

> dput(x)
c(79.9440288884057, 7.13067112046382, 16.5846998449958,
5.75840705300372, 8.09926965727001, 3.01384396246966,
11.0732434661792, 7.85954318513653, 17.2736237329392,
20.3932706296264, 7.62788437568602, 3.51703345513918,
15.2513940236837, 7.95111256035277, 2.9356001527546, 5.21640800341245,
1.31803867968863, 2.55515665315933, 5.13594826475305,
5.71804227249366, -1.12212247879953, 2.98172576960702,
2.27752210573477, 3.44436191014204, 4.2577789817301, 7.04679779929016,
3.73716953022729, 13.2253513389983, 9.2627049946394, 3.72144768484013,
-1.61454965523937, 5.2247551348966, 7.13715676537241,
3.02007539508224, 5.28341775873891, 2.90561296982048,
8.71618832752858, -1.35157689607084, 3.21088667542959,
-2.09878528959777, 5.55102176907134, 8.05388328712411,
3.86137275447247, 2.15562523389998, 7.85247493631722,
15.3556464279099, -4.55059942197948, 3.78768707731680,
11.8018662795635, 11.2682042854725, 11.2209559360934, 7.06058834054737)

The 1st observation -- 79.94 -- is what worries me. If I say:

> ar(x)

Call:
ar(x = x)


Order selected 0  sigma^2 estimated as  130.5 

The AIC-best lag order is 0. But if I drop this 1st observation, I get:

> ar(x[-1])

Call:
ar(x = x[-1])

Coefficients:
     1  
0.2167  

Order selected 1  sigma^2 estimated as  25.72 



In this example, it seems easy to say: "Just drop this 1st observation
and proceed". But this is just an example. I want a general procedure
which can be written into a program. In short, what's a practical and
sensible approach to fitting ARMA models when the data contain
influential observations?

-- 
Ajay Shah                                      http://www.mayin.org/ajayshah  
ajayshah using mayin.org                             http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.



