[R] filling small gaps of N/A

Wed Apr 4 09:18:02 CEST 2012

> 
> Michael,
> 
> First of all, thank you very much for your answer.
> I've read your 2 answers, but I'm not really sure that they correspond to
> my problem of NAs.

You should read the answers more carefully.

x<-rnorm(20)
x[3:4]<-NA
x[12:19]<-NA
x
 [1] -0.30754528  0.07597988          NA          NA -0.50585319 -1.60509616  0.31488672  2.16969731
 [9]  0.67755514 -1.83075111  0.72044482          NA          NA          NA          NA          NA
[17]          NA          NA          NA -0.96576934

library(zoo)

na.approx(x)
 [1] -0.30754528  0.07597988 -0.11796447 -0.31190883 -0.50585319 -1.60509616  0.31488672  2.16969731
 [9]  0.67755514 -1.83075111  0.72044482  0.53308769  0.34573056  0.15837343 -0.02898370 -0.21634083
[17] -0.40369795 -0.59105508 -0.77841221 -0.96576934

na.approx(x, maxgap=3)
 [1] -0.30754528  0.07597988 -0.11796447 -0.31190883 -0.50585319 -1.60509616  0.31488672  2.16969731
 [9]  0.67755514 -1.83075111  0.72044482          NA          NA          NA          NA          NA
[17]          NA          NA          NA -0.96576934

Does exactly what you want as far as I understand what you described.
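
For 15-minute data like the examples quoted below, a minimal sketch could look like this. The values and timestamps are made up to mimic the two examples, maxgap = 4 is an assumption taken from the "4 NAs maximum" rule in the question, and na.rm = FALSE is passed only so that the result keeps the same length as the input.

```r
library(zoo)

# Hypothetical 15-minute series: a small gap (2 NAs) and a big gap (6 NAs)
tt <- seq(as.POSIXct("2008-01-09 12:00", tz = "UTC"),
          by = "15 min", length.out = 14)
v  <- c(1.93, 3.93, NA, NA, 4.93, 5.93, rep(NA, 6), 7.93, 7.93)
z  <- zoo(v, tt)

# Interpolate linearly, but only across runs of at most 4 consecutive NAs
filled <- na.approx(z, maxgap = 4, na.rm = FALSE)

filled              # the 2-NA gap is filled, the 6-NA gap is left as NA
sum(is.na(filled))  # number of NAs that remain
```

The same call should also work on a plain numeric vector, as in the rnorm() example above.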

Regards
Petr

> I'll try to give you a bit more detail.
> 
> This problem concerns the second part of my program. In the first part, I've
> already created a timeseries object with the library (timeseries). I first had
> to delete all the wrong values in my data and replace them with NAs.
> So my data already contains missing data (NAs), as I have cleaned it before.
> 
> The thing is that sometimes I have small gaps of missing data (only 2 or 3
> consecutive NAs), like in "example 1" below:
> 
> example 1:
> 
> 09/01/2008 12:00      1.93
> 09/01/2008 12:15      3.93
> 09/01/2008 12:30      NA          So here you have a small gap with only 2 NAs
> 09/01/2008 12:45      NA
> 09/01/2008 13:00      4.93
> 09/01/2008 13:15      5.93
> 
> But sometimes, still in the same file, I have big gaps, such as 10 or more
> NAs following each other, like in "example 2" below:
> 
> example 2:
> 
> 09/01/2008 16:15      2.93
> 09/01/2008 16:30      2.93
> 09/01/2008 16:45      NA
> 09/01/2008 17:00      NA
> 09/01/2008 17:15      NA
> 09/01/2008 17:30      NA
> 09/01/2008 17:45      NA
> 09/01/2008 18:00      NA          So here you have a big gap with more than 10 NAs following each other
> 09/01/2008 18:15      NA
> 09/01/2008 18:30      NA
> 09/01/2008 18:45      NA
> 09/01/2008 19:00      NA
> 09/01/2008 19:15      NA
> 09/01/2008 19:30      NA
> 09/01/2008 19:45      NA
> 09/01/2008 20:00      NA
> 09/01/2008 20:15      7.93
> 09/01/2008 20:30      7.93
> 
> So in the same file, I can sometimes have small gaps (2 or 3 NAs) and
> sometimes big or very big gaps (10 or 100 NAs following each other).
> 
> My aim is to apply the function na.approx(x) of the library (zoo)
> to fill NAs ONLY for small gaps.
> 
> If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big
> gaps + small gaps). It's exactly what I DON'T WANT.
> 
> My problem is to say to R: "apply the function (na.approx) to fill NAs
> ONLY if you see 4 NAs maximum following each other (small gaps, like
> example 1)". "If you see more than 4 NAs following each other (big gaps like
> in example 2), keep these NAs and DON'T fill this big gap".
> 
> My question is: how can I say this to R? I don't know how to do it.
> Hope I've been understandable this time ^^
> Thanks a lot again for all your answers!
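
The rule described above (fill a run of NAs only if it contains at most 4 of them) can also be sketched in base R with rle(). This is only an illustration of the logic, not how zoo implements maxgap; the helper name small_gap and the sample values are hypothetical.

```r
# Hypothetical helper: TRUE for every NA that sits in a run of at most `maxgap` NAs
small_gap <- function(x, maxgap = 4) {
  r  <- rle(is.na(x))                     # run lengths of NA / non-NA stretches
  ok <- r$values & r$lengths <= maxgap    # keep only short NA runs
  rep(ok, r$lengths)                      # expand back to one flag per element
}

# Made-up data: a small gap (2 NAs) and a big gap (6 NAs)
x <- c(1.93, 3.93, NA, NA, 4.93, 5.93, rep(NA, 6), 7.93)

# Linear interpolation over all gaps with base R's approx() ...
idx  <- seq_along(x)
full <- approx(idx, x, xout = idx)$y

# ... then copy the interpolated values into the small gaps only
fill <- small_gap(x, maxgap = 4)
out  <- x
out[fill] <- full[fill]
out
```

In practice na.approx(x, maxgap = 4) from zoo is the shorter route; the sketch only makes the "count the consecutive NAs first" step explicit.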