[R] how to use function of rle approx ifelse etc. in data frame
Rosa
rosatan6 at gmail.com
Thu Aug 2 00:52:03 CEST 2012
Hello R help,
I have a data frame M2 (160000 rows, 5 columns) that contains NAs. A simple example would be:
set.seed(1234)
M2 <- expand.grid(ID=182:183, year=2012, month=1:3, day=1:3,
                  KEEP.OUT.ATTRS=FALSE)
M2 <- M2[with(M2, order(ID, year, month, day)),] # sort the data
M2$value <- sample(c(NA, rnorm(100)), nrow(M2),
                   prob=c(0.5, rep(0.5/100, 100)), replace=TRUE)
M2:
ID year month day value
1 182 2012 1 1 -0.5012581
7 182 2012 1 2 1.1022975
13 182 2012 1 3 NA
3 182 2012 2 1 -0.1623095
9 182 2012 2 2 1.1022975
15 182 2012 2 3 -1.2519859
5 182 2012 3 1 NA
11 182 2012 3 2 NA
17 182 2012 3 3 NA
2 183 2012 1 1 0.9729168
8 183 2012 1 2 0.9594941
14 183 2012 1 3 NA
4 183 2012 2 1 NA
10 183 2012 2 2 -1.1088896
16 183 2012 2 3 0.9594941
6 183 2012 3 1 -0.4027320
12 183 2012 3 2 -0.0151383
18 183 2012 3 3 -1.0686427
In this example the longest run of consecutive NAs is 3, but my real data can
have runs of more than 10 NAs. What I need to do is:
1. split the data by ID, year and month;
2. within each subset, if a run has fewer than 5 consecutive NAs, repeat the
previous value; if it has 5 to 10 NAs, do a linear interpolation; and if it has
more than 10 NAs, delete the whole month;
3. if the first day of the month is NA, fill backward instead (from the next
available value).
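To make the thresholds in point 2 concrete, here is a minimal base R sketch that labels each position with the action its NA run would need. The function name na_action and the label strings are my own, hypothetical choices, and the cut points assume "fewer than 5" means runs of 1 to 4 and "5 to 10" is inclusive.

```r
# Hypothetical helper: label every element of x by the rule that applies
# to the NA run it belongs to (non-NA values are labelled "keep").
na_action <- function(x) {
  runs <- rle(is.na(x))
  # Length of the NA run at each position; 0 for observed values
  len <- rep(ifelse(runs$values, runs$lengths, 0L), times = runs$lengths)
  # Intervals are (-Inf,0], (0,4], (4,10], (10,Inf]
  as.character(cut(len, breaks = c(-Inf, 0, 4, 10, Inf),
                   labels = c("keep", "repeat", "interpolate", "delete")))
}

x <- c(1, NA, NA, 2, rep(NA, 5), 3)
na_action(x)
```

The point of working per position rather than per column is that a single month can contain both a short run and a long run, so one if/else on the whole vector cannot pick the right rule for both.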
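So far, thanks to sebastian-c, the part for more than 10 NAs is done:
library(zoo)
# Return NULL (drop the subset) if any run of NAs is longer than maxlen
NA_run <- function(x, maxlen){
  runs <- rle(is.na(x$value))
  # '>' rather than '>=' so that a run of exactly maxlen NAs is kept
  if(any(runs$lengths[runs$values] > maxlen)) NULL else x
}
library(plyr)
rem <- ddply(M2, .(ID, year, month), NA_run, 10)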
As for the other two parts, I figured out that for fewer than 5 NAs I can use
na.locf(rem$value, na.rm=FALSE, maxgap=5), and for 5 to 10 NAs I can use
approx(rem$value, n=length(rem$value))$y. However, when I put them into
if/else it keeps failing. Is that because the values are in a data frame? I
have checked many posts on this issue, but none of the solutions work on my
data. Any help would be appreciated, thanks.
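For reference, this is roughly the behaviour I am after for one month of values, written as a base R sketch rather than with na.locf. The name fill_month is hypothetical, the thresholds are my reading of the three rules above, and "backward" is assumed to mean carrying the next observed value back to the start of the month.

```r
# Hypothetical helper: fill one month's values according to NA run length.
# Returns NULL when the month should be dropped.
fill_month <- function(v) {
  ok  <- !is.na(v)
  idx <- seq_along(v)
  runs <- rle(!ok)
  run_len <- rep(ifelse(runs$values, runs$lengths, 0L), times = runs$lengths)

  # More than 10 consecutive NAs: signal that the month is to be deleted
  if (any(run_len > 10)) return(NULL)

  # Last observation carried forward; NA where no earlier value exists
  locf <- c(NA, v)[cummax(ifelse(ok, idx, 0L)) + 1L]
  # Next observation carried backward (assumed meaning of "backward")
  nocb <- rev(c(NA, rev(v))[cummax(ifelse(rev(ok), idx, 0L)) + 1L])

  out <- v
  short <- run_len >= 1 & run_len < 5
  out[short] <- locf[short]

  mid <- run_len >= 5
  if (any(mid) && sum(ok) >= 2) {
    # Linear interpolation over the day index; NA outside the observed range
    out[mid] <- approx(idx[ok], v[ok], xout = idx[mid])$y
  }

  # Values still missing before the first observation are filled backward
  lead <- is.na(out) & cumsum(ok) == 0
  out[lead] <- nocb[lead]
  out
}

fill_month(c(NA, 1, NA, NA, 4, NA))
```

I assume something like this could be called once per subset from ddply in place of NA_run, replacing the value column when the result is not NULL, but I have not tested that on the full data.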