[R] filling small gaps of N/A
R. Michael Weylandt
michael.weylandt at gmail.com
Tue Apr 3 15:26:20 CEST 2012
Sorry -- left out a major detail: most of these functions have maxgap
arguments which allow you to leave larger gaps of NAs as NAs.
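For example, here is a minimal sketch of how maxgap could be applied to the original question. The toy series and the threshold of 4 are illustrative assumptions, not the poster's data: na.approx() with maxgap = 4 interpolates runs of at most 4 consecutive NAs and leaves longer runs as NA.

```r
library(zoo)

# Toy daily series (illustrative only): one short gap and one long gap
x <- zoo(as.numeric(1:30), as.Date("2012-01-01") + 0:29)
x[5:7] <- NA     # short gap of 3 NAs: should be filled
x[15:24] <- NA   # long gap of 10 NAs: should stay NA

# Linear interpolation restricted to runs of at most 4 consecutive NAs;
# na.rm = FALSE keeps the series at its original length
filled <- na.approx(x, maxgap = 4, na.rm = FALSE)

# Compare the original and the filled series side by side
print(merge(x, filled))
```

The same maxgap argument should work with na.locf() or na.spline() if a different filling rule is preferred.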
Best,
Michael
On Tue, Apr 3, 2012 at 9:24 AM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> It seems like you could benefit from using a zoo [time series] object
> to hold your data -- then you have a variety of NA filling functions
> which work for arbitrarily long gaps. E.g.,
>
> library(zoo)
> x <- zoo(1:100, Sys.Date() + 1:100)
> x[2:60] <- NA
>
> # Most of these look the same because the data is simple; they will give
> # different results for more complicated examples
> na.approx(x)
> na.locf(x)
> na.spline(x)
> na.aggregate(x)
> na.fill(x, fill = 0) # Takes more arguments: needs a fill value (0 is only an example)
>
> Hope this helps,
> Michael
>
> On Tue, Apr 3, 2012 at 4:52 AM, jeff6868
> <geoffrey_klein at etu.u-bourgogne.fr> wrote:
>> Hi everybody,
>>
>> I'm a new R user from France. Sorry if my English is not perfect. I hope
>> you'll understand my problem ;)
>>
>> I have to work on temperature data (35000 lines in one file) containing some
>> missing data (N/A). Sometimes there are only 2 or 3 consecutive N/A, but
>> sometimes there are 100 or 200 in a row. Here's an example of my data with
>> only a small gap of missing data (2 or 3 N/A):
>>
>> 09/01/2008 12:00 2 1.93 2.93 4.56 5.43
>> 09/01/2008 12:15 2 *3.93* 3.25 4.93 5.56
>> 09/01/2008 12:30 2 NA 3.5 5.06 5.56
>> 09/01/2008 12:45 2 NA 3.68 5.25 5.68
>> 09/01/2008 13:00 2 *4.93* 3.87 5.56 5.93
>> 09/01/2008 13:15 2 5.93 4.25 5.75 6.06
>> 09/01/2008 13:30 2 3.93 4.56 5.93 6.18
>>
>> My question is: how can I replace these small gaps of N/A with numeric values?
>> I would like a function that replaces only the small gaps (2 or 3 N/A) in my
>> data, but not the big gaps (more than 5 consecutive N/A).
>>
>> For the moment, I'm trying to do it by working with the time gap between the
>> 2 numeric values surrounding the N/A, as follows:
>>
>> imputation <- function(x){
>>   met = NULL
>>   temp <- met[1] <- x[1]
>>   ind_temp <- 1
>>   tps <- time(x)
>>
>>   for (i in 2:(length(x)) ){
>>     if((tps[i]-tps[ind_temp] > 1)&(tps[i]-tps[ind_temp] <= 4)&(is.na(x[i]))){
>>       met[i] <- na.approx(x)
>>     }
>>     else {
>>       temp <- met[i] <- x[i]
>>       ind_temp <- i
>>     }
>>   }
>>
>>   return(met)
>> }
>>
>> In this example, I would like to apply na.approx(x) to my N/A, but only
>> when there are at most 4 consecutive N/A.
>> There's no error, but it doesn't work (a similar approach worked the other
>> way round, when I had to detect aberrant data and replace it with N/A, but
>> not now). This may not be the right way to solve the problem. I don't have
>> a lot of experience in R. Maybe there is an easier way to do it...
>> Does anybody have an idea that could help me?
>> Thanks a lot!
>>
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528184.html
>> Sent from the R help mailing list archive at Nabble.com.