[R] how to interpolate time series data with missingness

Gabor Grothendieck ggrothendieck at gmail.com
Thu Jun 18 00:07:06 CEST 2009


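The zoo package has a number of na.* routines for filling in NAs:
na.approx (linear interpolation), na.locf (last observation carried
forward) and na.spline (cubic spline interpolation). Note that
na.approx drops the trailing NA by default, so its result below has
10 values rather than 11: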

> library(zoo)
> x <- c(2,3,NA,NA,NA,3.2,3.5,NA,NA,6,NA)
> na.approx(x)
 [1] 2.000000 3.000000 3.050000 3.100000 3.150000 3.200000 3.500000 4.333333
 [9] 5.166667 6.000000
> na.locf(x)
 [1] 2.0 3.0 3.0 3.0 3.0 3.2 3.5 3.5 3.5 6.0 6.0
> na.spline(x)
 [1] 2.000000 3.000000 3.366531 3.352065 3.211566 3.200000 3.500000 4.045127
 [9] 4.857627 6.000000 7.534746
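
None of the three calls above gives exactly the x.new vector asked for
in the quoted message below (each NA replaced by the average of the
nearest non-missing values on either side). A minimal sketch of one
way to get it, assuming it is acceptable to use a single neighbour
where only one exists (leading or trailing NAs); the names fwd, bwd,
fwd_filled and bwd_filled are just illustrative:

```r
library(zoo)

x <- c(2, 3, NA, NA, NA, 3.2, 3.5, NA, NA, 6, NA)

# Carry the previous observation forward and the next observation backward.
# na.rm = FALSE keeps both results the same length as x.
fwd <- na.locf(x, na.rm = FALSE)
bwd <- na.locf(x, fromLast = TRUE, na.rm = FALSE)

# Assumption: where only one neighbour exists (leading or trailing NAs),
# use that neighbour on its own.
fwd_filled <- ifelse(is.na(fwd), bwd, fwd)
bwd_filled <- ifelse(is.na(bwd), fwd, bwd)

# Average of the nearest non-missing value on each side.
x.new <- (fwd_filled + bwd_filled) / 2
x.new

# Linear interpolation that keeps the trailing NA instead of dropping it.
x.lin <- na.approx(x, na.rm = FALSE)
```

Observed values are left unchanged, since both fills agree there. With
na.rm = FALSE the na.approx result has the same length as x, which is
convenient if it has to be aligned with other vectors.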


On Wed, Jun 17, 2009 at 5:54 PM, Matthew Keller <mckellercran at gmail.com> wrote:
> Hi all,
>
> I have a vector, most of which is missing. The data is always
> increasing, but may do so in jumps. I would like to interpolate the
> NAs with 'best guesses', using something like filter(), which doesn't
> work due to the NAs. Here is an example:
>
>> x <- c(2,3,NA,NA,NA,3.2,3.5,NA,NA,6,NA)
>> x
>  [1] 2.0 3.0  NA  NA  NA 3.2 3.5  NA  NA 6.0  NA
>
> I would like a function that would take the NAs and fill in the
> average values around the NAs. E.g., make a new vector x.new that
> looks like:
>> x.new
> [1] 2.0 3.0 3.1 3.1 3.1 3.2 3.5 4.75 4.75 6 6
>
> Or, alternatively, that could figure out a more likely value than just
> the average. There must be something simple I'm overlooking, like some
> kind of loess y-hat or something? Any help would be appreciated,
>
> Matt
>
> --
> Matthew C Keller
> Asst. Professor of Psychology
> University of Colorado at Boulder
> www.matthewckeller.com



