[R] Maximizing values in subsetted dataframe
Tim Clark
mudiver1200 at yahoo.com
Wed Jul 29 02:56:44 CEST 2009
Dear List,
I am trying to sub-sample some data by taking a data point every x minutes. The data contain missing values, and I would like to take the sub-sample that maximizes the number of valid points, i.e. minimizes the number of NAs in the sample.
For example, given the following:
da <- seq(Sys.time(), by = 1, length.out = 10)
x <- c(1, 2, NA, 4, NA, 6, NA, 8, 9, 10)
mydata <- data.frame(da, x)
If I wanted to take a subsample every 2 seconds, I would have the following two possible answers:
answer1: 2,4,6,8,10
answer2: 1,NA,NA,NA,9
I would like a function that would choose between these and obtain the one with the fewest missing values.
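A minimal sketch of such a function, assuming the rows are evenly spaced in time so that "every 2 seconds" means "every 2nd row". The name best_subsample and its arguments are hypothetical, not from any package. It tries each possible starting offset, counts the NAs each one would yield, and keeps the offset with the fewest:

```r
# Hypothetical helper: pick the starting offset (1..step) whose
# every-step-th subsample contains the fewest NA values in column `col`.
best_subsample <- function(data, step, col = "x") {
  offsets <- seq_len(min(step, nrow(data)))
  # Count the NAs each candidate offset would produce
  na_counts <- sapply(offsets, function(off) {
    idx <- seq(off, nrow(data), by = step)
    sum(is.na(data[[col]][idx]))
  })
  best <- offsets[which.min(na_counts)]  # ties go to the earliest offset
  data[seq(best, nrow(data), by = step), , drop = FALSE]
}

# Example data from the post
da <- seq(Sys.time(), by = 1, length.out = 10)
x <- c(1, 2, NA, 4, NA, 6, NA, 8, 9, 10)
mydata <- data.frame(da, x)

result <- best_subsample(mydata, step = 2)
```

Because which.min returns the first minimum, two offsets with the same NA count resolve to the earlier one; a different tie-breaking rule would need to be added explicitly.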
In my real dataset I have multiple variables collected every second and I would like to subsample it every 5, 10, and 15 minutes.
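For the multi-variable case, one possible approach is to score each offset by the total number of NAs across all variables of interest. The sketch below is only an illustration: best_subsample_multi, the column names depth and temp, and the simulated data are all made up, and it assumes exactly one row per second with no gaps, so an interval of m minutes corresponds to every (m * 60)-th row.

```r
# Hypothetical helper: score each starting offset by the total NA count
# across the chosen columns and keep the offset with the lowest score.
best_subsample_multi <- function(data, step, cols) {
  offsets <- seq_len(min(step, nrow(data)))
  na_counts <- sapply(offsets, function(off) {
    idx <- seq(off, nrow(data), by = step)
    sum(is.na(data[idx, cols, drop = FALSE]))
  })
  best <- offsets[which.min(na_counts)]  # ties go to the earliest offset
  data[seq(best, nrow(data), by = step), , drop = FALSE]
}

# Simulated one-hour record, one row per second (made-up data)
set.seed(1)
n <- 3600
sim <- data.frame(
  da = seq(as.POSIXct("2009-07-29 00:00:00", tz = "UTC"), by = 1, length.out = n),
  depth = ifelse(runif(n) < 0.3, NA, rnorm(n)),
  temp = ifelse(runif(n) < 0.3, NA, rnorm(n))
)

# Subsample every 5, 10, and 15 minutes
minutes <- c(5, 10, 15)
subsamples <- lapply(minutes * 60, function(step) {
  best_subsample_multi(sim, step, cols = c("depth", "temp"))
})
names(subsamples) <- paste0(minutes, "min")
```

Summing NAs over all columns treats every variable as equally important; if one variable matters more, the score inside sapply could be weighted accordingly.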
I appreciate your help.
Tim
Tim Clark
Department of Zoology
University of Hawaii