[R] Time Series and Auto.arima

Lorenzo Isella lorenzo.isella at gmail.com
Sun Jan 31 20:05:19 CET 2016


Part of the trouble is that auto.arima converts the zoo time series
into a ts object.
In doing so, the series is placed on a regular time grid and some
missing values (NA) appear.
To fix this, I replace each NA with the previous non-NA value.
This is easy enough and the series exhibits some clear cycles: roughly
every month there is a spike, followed by a decrease, then another
spike and so on.
I would like to forecast a couple of cycles (60 steps), but when I do
so with auto.arima, nothing like what I expect appears (the
seasonality is completely lost).
Any idea why?
I paste below the revised R code for reproducibility.

Lorenzo





library(forecast)
library(zoo)  # na.locf() comes from zoo; attach it explicitly

tt<-structure(c(1494.5, 1367.57, 1357.57, 1222.23, 1124.02, 1011.64,
4575.64, 3201.87, 3050.04, 2173.38, 1967.88, 1838.55, 1666.05,
1656.05, 1524.96, 835.96, 775.36, 592.36, 494.15, 4058.15, 2624.36,
2448.47, 1598.47, 1398.47, 1264.14, 1165.88, 1053.67, 941.36,
821.36, 471.36, 373.15, 259.91, 3808.91, 2262.26, 1940.39, 1011.39,
800.81, 790.81), index = structure(c(16563L, 16565L, 16570L,
16572L, 16577L, 16579L, 16584L, 16585L, 16586L, 16587L, 16588L,
16589L, 16590L, 16592L, 16593L, 16599L, 16606L, 16607L, 16608L,
16612L, 16613L, 16614L, 16617L, 16618L, 16619L, 16620L, 16621L,
16628L, 16633L, 16635L, 16638L, 16642L, 16647L, 16648L, 16649L,
16650L, 16651L, 16654L), class = "Date"), class = "zoo")

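# Put the series on a regular daily grid (days without data become NA),
# then carry the last observation forward to fill the gaps.
tt2<-as.ts(tt)
tt2<-na.locf(tt2)

# Fit and forecast 60 steps ahead.
mm<-auto.arima(tt2)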


plot(forecast(mm, h=60))
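
One possible reason for the lost seasonality, offered as a sketch rather than a confirmed diagnosis: the ts object produced by as.ts() has frequency 1, so no seasonal period is declared and auto.arima has nothing seasonal to search over. The snippet below rebuilds the same series, checks the frequency, and then refits after declaring a period of 30 days. The value 30 is an assumption taken from "roughly every month"; the names vals, days, tt3, mm2 and fc are only illustrative. With about three cycles of data the seasonal fit may still be weak.

```r
library(zoo)
library(forecast)

# Same balances and dates as in the snippet above.
vals <- c(1494.5, 1367.57, 1357.57, 1222.23, 1124.02, 1011.64,
          4575.64, 3201.87, 3050.04, 2173.38, 1967.88, 1838.55, 1666.05,
          1656.05, 1524.96, 835.96, 775.36, 592.36, 494.15, 4058.15, 2624.36,
          2448.47, 1598.47, 1398.47, 1264.14, 1165.88, 1053.67, 941.36,
          821.36, 471.36, 373.15, 259.91, 3808.91, 2262.26, 1940.39, 1011.39,
          800.81, 790.81)
days <- as.Date(c(16563, 16565, 16570, 16572, 16577, 16579, 16584, 16585,
                  16586, 16587, 16588, 16589, 16590, 16592, 16593, 16599,
                  16606, 16607, 16608, 16612, 16613, 16614, 16617, 16618,
                  16619, 16620, 16621, 16628, 16633, 16635, 16638, 16642,
                  16647, 16648, 16649, 16650, 16651, 16654),
                origin = "1970-01-01")
tt <- zoo(vals, days)

# as.ts() puts the series on a daily grid; days without data become NA.
tt2 <- as.ts(tt)
n_missing <- sum(is.na(tt2))
freq_before <- frequency(tt2)  # no seasonal period is declared here

# Carry the last observation forward, then declare the period.
filled <- na.locf(as.numeric(tt2))
tt3 <- ts(filled, frequency = 30)  # ASSUMPTION: cycle of about 30 days

mm2 <- auto.arima(tt3)
fc <- forecast(mm2, h = 60)
plot(fc)
```

Whether auto.arima then selects a seasonal model depends on its internal seasonal-differencing test, so the forecast should be inspected rather than trusted blindly.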




On Fri, Jan 29, 2016 at 02:16:27PM -0800, David Winsemius wrote:
>
>> On Jan 29, 2016, at 12:59 PM, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
>>
>> Dear All,
>> I am puzzled and probably I am misunderstanding something.
>> Please consider the snippet at the end of the email.
>> We see a time series that has clearly some pattern (essentially, it is
>> an account where a salary is regularly paid followed by some
>> expenses).
>> However the output of the auto.arima function from the forecast
>> package does not seem to make any sense (at least to me).
>> I wonder if the problem is the fact that the time series is not
>> defined at regular intervals.
>> Any suggestions and alternative ways to fit it (e.g.: sarima from the astsa
>> package to account for the seasonality?) are really welcome.
>> Many thanks
>>
>> Lorenzo
>>
>>
>>
>> ##############################################
>> library(forecast)
>>
>> tt<-structure(c(1494.5, 1367.57, 1357.57, 1222.23, 1124.02, 1011.64,
>> 4575.64, 3201.87, 3050.04, 2173.38, 1967.88, 1838.55, 1666.05,
>> 1656.05, 1524.96, 835.96, 775.36, 592.36, 494.15, 4058.15, 2624.36,
>> 2448.47, 1598.47, 1398.47, 1264.14, 1165.88, 1053.67, 941.36,
>> 821.36, 471.36, 373.15, 259.91, 3808.91, 2262.26, 1940.39, 1011.39,
>> 800.81, 790.81), index = structure(c(16563L, 16565L, 16570L,
>> 16572L, 16577L, 16579L, 16584L, 16585L, 16586L, 16587L, 16588L,
>> 16589L, 16590L, 16592L, 16593L, 16599L, 16606L, 16607L, 16608L,
>> 16612L, 16613L, 16614L, 16617L, 16618L, 16619L, 16620L, 16621L,
>> 16628L, 16633L, 16635L, 16638L, 16642L, 16647L, 16648L, 16649L,
>> 16650L, 16651L, 16654L), class = "Date"), class = "zoo")
>>
>> plot(tt)
>>
>
>library(forecast)
>
>> fit<-auto.arima(tt)
>>
>> ###########################################
>
>If, after running plot(tt), you then run:
>
> fitted(fit)
>
>Time Series:
>Start = 16563
>End = 16654
>Frequency = 1
> [1] 1448.8211        NA 1444.8612        NA        NA        NA        NA
> [8] 1398.7752        NA 1359.0350        NA        NA        NA        NA
>[15] 1309.1398        NA 1219.7420        NA        NA        NA        NA
>[22] 2302.8903 3708.1762 2713.0349 2603.0512 1968.0100 1819.1484 1725.4634
>[29]        NA 1572.6179 1593.2628        NA        NA        NA        NA
>[36]        NA 1258.3403        NA        NA        NA        NA        NA
>[43]        NA 1184.9656  955.3023  822.7394        NA        NA        NA
>[50] 1987.7634 3333.3131 2294.6941        NA        NA 1760.6351 1551.5526
>[57] 1406.6751 1309.3682 1238.1899        NA        NA        NA        NA
>[64]        NA        NA 1251.6898        NA        NA        NA        NA
>[71] 1179.9970        NA  988.3885        NA        NA  888.4533        NA
>[78]        NA        NA  889.4017        NA        NA        NA        NA
>[85] 1970.0911 3152.7668 2032.3935 1799.2350 1126.2794        NA        NA
>[92] 1088.1525
>
>
>Using that vector:
>
>lines(seq(16563 ,16654 ),fitted(fit), col="red", lwd=3)
>
>You can see that the fitted values are capturing quite a bit of the variation.
>
>
>
>I'm not a regular user of pkg:forecast, so there may be more refined methods of extracting information than using `fitted`.
>
>-- 
>
>David Winsemius
>Alameda, CA, USA
>
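
As a small follow-up to the fitted() suggestion quoted above, here is a hedged sketch of two other ways to inspect a fit from the forecast package: accuracy() for training-set error measures and residuals() for the raw errors. The series x is hypothetical stand-in data (a noisy six-step sawtooth), not the account balances from this thread, and the names acc and res are only illustrative.

```r
library(forecast)

# Stand-in data (hypothetical): a six-step sawtooth with some noise.
set.seed(1)
x <- ts(rep(c(4000, 3000, 2200, 1600, 1100, 700), times = 10) +
          rnorm(60, sd = 50))

fit <- auto.arima(x)

# Overlay fitted values; fitted() returns a ts, so lines() uses its own
# time index and no hard-coded seq() is needed.
plot(x)
lines(fitted(fit), col = "red", lwd = 3)

acc <- accuracy(fit)   # training-set measures such as RMSE and MAE
res <- residuals(fit)  # one residual per observation
print(acc)
```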


