[R] Interpolating / smoothing missing time series data
Spencer Graves
spencer.graves at pdf.com
Thu Sep 8 18:59:41 CEST 2005
(see inline)
Sean Davis wrote:
> On 9/7/05 10:19 PM, "Gabor Grothendieck" <ggrothendieck at gmail.com> wrote:
>
>
>>On 9/7/05, David James <djames at frontierassoc.com> wrote:
>>
>>>The purpose of this email is to ask for pre-built procedures or
>>>techniques for smoothing and interpolating missing time series data.
>>>
>>>I've made some headway on my problem in my spare time. I started
>>>with an irregular time series with lots of missing data. It even had
>>>duplicated data. Thanks to zoo, I've cleaned that up -- now I have a
>>>regular time series with lots of NA's.
>>>
>>>I want to use a regression model (i.e. ARIMA) to fill in the gaps. I
>>>am certainly open to other suggestions, especially if they are easy
>>>to implement.
>>>
>>>My specific questions:
>>>1. Presumably, once I get ARIMA working, I still have the problem of
>>>predicting the past missing values -- I've only seen examples of
>>>predicting into the future.
>>>2. When predicting the past (backcasting), I also want to take
>>>reasonable steps to make the data look smooth.
>>>
>>>I guess I'm looking for a really good example in a textbook or white
>>>paper (or just an R guru with some experience in this area) that can
>>>offer some guidance.
>>>
>>>Venables and Ripley was a great start (Modern Applied Statistics with
>>>S). I really had hoped that the "Seasonal ARIMA Models" section on
>>>page 405 would help. It was helpful, but only to a point. I have a
>>>hunch (based on me crashing arima numerous times -- maybe I'm just
>>>new to this and doing things that are unreasonable?) that using
>>>hourly data just does not mesh well with the seasonal arima code?
>>
Have you looked at Durbin, J. and Koopman, S. J. (2001) _Time Series
Analysis by State Space Methods_, Oxford University Press, which is
cited in the help page "?arima"? They explain that Kalman filtering
predicts the future from past observations, while Kalman smoothing uses
all the data, before and after each gap, to fill the gaps, which seems
to match your question. I was able to reproduce Figure 2.1 in that book
but got bogged down with Figure 2.2 before I dropped the project. I can
send you the script file I developed while working on that if it would
help you.
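
As a rough illustration of the smoothing idea (a sketch only, not the
script mentioned above): "arima" accepts NA's when method = "ML", and
"KalmanSmooth" in the stats package can then estimate the missing
points from the data on both sides of each gap. The simulated AR(1)
series, the gap positions and the model order below are all assumptions
made up for the example, and the state-space model is rebuilt with
"makeARIMA" so that the smoother starts from the initial state.

```r
set.seed(1)
n <- 200
x <- arima.sim(list(ar = 0.8), n = n) + 10    # hypothetical regular series
x_na <- x
x_na[c(50:55, 120:130)] <- NA                 # hypothetical gaps

# Fit an AR(1) with a mean; NA's are handled by the Kalman filter.
fit <- arima(x_na, order = c(1, 0, 0), method = "ML")
mu  <- unname(coef(fit)["intercept"])
phi <- unname(coef(fit)["ar1"])

# Rebuild the state-space form so the smoother starts at the initial state.
mod <- makeARIMA(phi = phi, theta = numeric(0), Delta = numeric(0))

# Smooth the mean-removed series, then map the state back to the data scale.
sm  <- KalmanSmooth(as.numeric(x_na - mu), mod)
est <- drop(sm$smooth %*% mod$Z) + mu

# Replace only the missing points; observed values are left untouched.
gap <- is.na(x_na)
filled <- x_na
filled[gap] <- est[gap]
```

Because the smoother uses observations after a gap as well as before
it, the same call covers "predicting the past"; no separate backcasting
step is needed in this sketch.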
I'm still interested in learning how to reproduce in R all the
examples in that book, and I'd happily receive suggestions from others
on how to do that.
spencer graves
>>Not sure if this answers your question but if you are looking for something
>>simple then na.approx in the zoo package will linearly interpolate for you.
>>
>>
>>>z <- zoo(c(1,2,NA,4,5))
>>>na.approx(z)
>>
>>1 2 3 4 5
>>1 2 3 4 5
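
For readers without zoo installed, roughly the same linear fill can be
sketched in base R with "approx", and "spline" gives a smoother fill
through the observed points. The vector below is made up for
illustration, and using the positions 1, 2, ... as the time index is an
assumption.

```r
x   <- c(1, 2, NA, 4, 5, NA, NA, 3)   # hypothetical series with gaps
idx <- seq_along(x)                   # assumed equally spaced time index
ok  <- !is.na(x)

# Linear interpolation across the gaps (same idea as na.approx).
lin <- approx(idx[ok], x[ok], xout = idx)$y

# Natural cubic spline through the observed points, for a smoother fill.
spl <- spline(idx[ok], x[ok], xout = idx, method = "natural")$y
```

Both fills pass through the observed values; they differ only inside
the gaps.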
>
>
> Alternatively, if you are looking for "more smoothing", you could look at
> using a moving average or median applied at points of interest with an
> "appropriate" window size--see wapply in the gplots package (gregmisc
> bundle). There are a number of other functions that can accomplish the same
> task. A search for "moving window" or "moving average" in the archives may
> produce some other ideas.
>
> Sean
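
A minimal base R sketch of the moving-window idea, using
"stats::filter" and "runmed" rather than "wapply": the noisy sine wave
and the window width of 7 are assumptions chosen only for the example,
and the input is taken to have its NA's already filled.

```r
set.seed(42)
x <- sin(seq(0, 4 * pi, length.out = 100)) + rnorm(100, sd = 0.3)
k <- 7                                # window width (odd), chosen arbitrarily

# Centred moving average; the first and last (k - 1) / 2 values are NA.
ma <- stats::filter(x, rep(1 / k, k), sides = 2)

# Running median over the same window; more robust to outliers.
med <- runmed(x, k)
```

A wider window gives a smoother curve at the cost of flattening real
peaks, so the "appropriate" size depends on the data.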
>
--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
spencer.graves at pdf.com
www.pdf.com
Tel: 408-938-4420
Fax: 408-280-7915