[R] identify data points by certain criteria
arun
smartpink111 at yahoo.com
Thu Jun 13 22:15:24 CEST 2013
Hi,
Maybe this helps:
source("Ye_data.txt")
dim(dat1)
#[1] 44640 3
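library(xts)
# Column 1 holds the timestamps ("%m/%d/%Y %H:%M"); the remaining columns are the data
xt1 <- xts(dat1[,-1], strptime(dat1[,1], "%m/%d/%Y %H:%M"))
# Keep only the rows between 00:00 and 08:00 of each day
xtSub <- xt1["T00:00:00/T08:00:00"]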
dim(xt1)
#[1] 44640 2
dim(xtSub)
#[1] 14911 2
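As a quick sanity check on the two dim() results, the row counts can be reproduced without the data file. This is only a sketch under assumptions that are not in the original data: one row per minute for December 2012 with no gaps, timestamps in UTC, and the made-up names idx, hm and in_window.

```r
# Synthetic stand-in for the timestamps: one per minute for 31 days (UTC assumed)
idx <- seq(as.POSIXct("2012-12-01 00:00", tz = "UTC"),
           by = "min", length.out = 31 * 1440)

# Time of day as "HH:MM"; zero-padded strings compare in clock order
hm <- format(idx, "%H:%M")
in_window <- hm >= "00:00" & hm <= "08:00"

length(idx)     # 44640 = 31 days * 1440 minutes
sum(in_window)  # 14911 = 31 days * 481 minutes (00:00 through 08:00 inclusive)
```

So 14911 rows is what a complete month gives when both 00:00 and 08:00 are included in the window.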
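# One xts object per calendar day
lst1 <- split(xtSub, as.Date(index(xtSub)))
# Per day: find the rows where the variables sum to 0, then the position
# (within those rows) that follows the longest gap
sapply(lst1, function(x) {
  indx <- which(rowSums(x) == 0)
  which.max(c(1, diff(index(x)[indx])))
})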
#2012-12-01 2012-12-02 2012-12-03 2012-12-04 2012-12-05 2012-12-06 2012-12-07
# 373 41 262 268 266 254 274
#2012-12-08 2012-12-09 2012-12-10 2012-12-11 2012-12-12 2012-12-13 2012-12-14
# 109 1 323 264 279 353 265
#2012-12-15 2012-12-16 2012-12-17 2012-12-18 2012-12-19 2012-12-20 2012-12-21
# 327 226 264 269 271 267 276
#2012-12-22 2012-12-23 2012-12-24 2012-12-25 2012-12-26 2012-12-27 2012-12-28
# 360 162 222 81 231 143 364
#2012-12-29 2012-12-30 2012-12-31
# 122 399 418
lst2 <- lapply(lst1, function(x) {
  indx <- which(rowSums(x) == 0)
  indx1 <- which.max(c(1, diff(index(x)[indx])))
  index(x)[indx1]
})
lst2[1:3]
#$`2012-12-01`
#[1] "2012-12-01 06:12:00 EST"
#
#$`2012-12-02`
#[1] "2012-12-02 00:40:00 EST"
#
#$`2012-12-03`
#[1] "2012-12-03 04:21:00 EST"
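
One detail worth double-checking in the lapply() call: which.max() returns a position within indx (the all-zero rows), so the timestamp lookup probably wants indx[indx1] rather than indx1 itself, the same way the toy-data version quoted further down uses indx[which.max(...)]. Below is a minimal base-R sketch of that mapping. The values are the toy example from this thread; the date, the UTC time zone and the helper name gap_end are made up for illustration, and rowSums(...) == 0 assumes the variables are never negative.

```r
# Toy data from the example in this thread; the date and time zone are assumptions
dat <- data.frame(
  Time = as.POSIXct("2012-12-01 00:00", tz = "UTC") + 60 * (0:13),
  Var1 = c(1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0),
  Var2 = 0
)

# Hypothetical helper: timestamp of the all-zero row that follows the longest gap
gap_end <- function(d) {
  indx <- which(rowSums(d[, c("Var1", "Var2")]) == 0)  # rows where both are 0
  gaps <- c(1, as.numeric(diff(d$Time[indx]), units = "mins"))
  pos  <- which.max(gaps)   # position within indx, not a row number of d
  d$Time[indx[pos]]         # map the position back to the actual row
}

res <- gap_end(dat)
format(res, "%H:%M")        # "00:09"

# The same helper applied per calendar day
by_day <- lapply(split(dat, as.Date(dat$Time)), gap_end)
```

On the toy data the all-zero rows are 2, 5, 10 and 14, the longest gap ends at the third of them, and indx[3] is row 10, i.e. 00:09. Indexing with the bare position 3 would return row 3 (00:02) instead.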
A.K.
________________________________
From: Ye Lin <yelin at lbl.gov>
To: arun <smartpink111 at yahoo.com>
Sent: Thursday, June 13, 2013 1:11 PM
Subject: Re: [R] identify data points by certain criteria
hey Arun,
Sorry about the confusion. My intention in using a simple sample was to simplify the question, so that I could study and modify the code you provided and then apply it to my real data.
Here is what my real data looks like. It is 1-min data for an entire month. I will focus on the period from 00:00 to 08:00 every day (from midnight to 8am) and try to find the timestamp that meets the criteria I mentioned before.
Thanks for your help!
Ye
On Thu, Jun 13, 2013 at 9:57 AM, arun <smartpink111 at yahoo.com> wrote:
>
>
>Hi Ye,
>Could you provide an example that mimics your real dataset? If I spend some time on this and it does not match your actual data, it is a waste of time.
>
>
>
>
>________________________________
>From: Ye Lin <yelin at lbl.gov>
>To: arun <smartpink111 at yahoo.com>
>Sent: Thursday, June 13, 2013 12:54 PM
>
>Subject: Re: [R] identify data points by certain criteria
>
>
>
>Oh, sorry.
>
>It is going to be from 00:00 to 23:59, a one-day range.
>
>
>
>On Thu, Jun 13, 2013 at 9:42 AM, arun <smartpink111 at yahoo.com> wrote:
>
>
>>
>>Hi,
>>
>>I was talking about the timestamp itself. I don't know the range of your timestamp.
>>
>>
>>indx[which.max(c(1,diff(as.numeric(gsub(".*:","",dat1[,1][indx])))))]
>>#[1] 10
>>
>> dat1[indx[which.max(c(1,diff(as.numeric(gsub(".*:","",dat1[,1][indx])))))],]
>># Time Var1 Var2
>>#10 00:09 0 0
>>A.K.
>>
>>________________________________
>>From: Ye Lin <yelin at lbl.gov>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Thursday, June 13, 2013 12:00 PM
>>Subject: Re: [R] identify data points by certain criteria
>>
>>
>>
>>
>>Basically, what I am trying to do is find the first timestamp that meets the criteria; in other words, "when does it happen?"
>>
>>
>>
>>On Wed, Jun 12, 2013 at 6:29 PM, arun <smartpink111 at yahoo.com> wrote:
>>
>>>Hi,
>>>Not clear about the 'Time' column.
>>>dat1<- read.table(text="
>>>
>>>Time Var1 Var2
>>>00:00 1 0
>>>00:01 0 0
>>>00:02 1 0
>>>00:03 1 0
>>>00:04 0 0
>>>00:05 1 0
>>>00:06 1 0
>>>00:07 1 0
>>>00:08 1 0
>>>00:09 0 0
>>>00:10 1 0
>>>00:11 1 0
>>>00:12 1 0
>>>00:13 0 0
>>>",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>
>>>
>>>indx<-which(rowSums(dat1[,-1])==0)
>>>dat1[indx[which.max(c(1,diff(as.numeric(gsub(".*:","",dat1[,1][indx])))))],]
>>># Time Var1 Var2
>>>#10 00:09 0 0
>>>dat1[indx[which.max(c(1,diff(as.numeric(gsub(".*:","",dat1[,1][indx])))))],"Time"]
>>>#[1] "00:09"
>>>
>>>
>>>A.K.
>>>
>>>
>>>
>>>
>>>----- Original Message -----
>>>From: Ye Lin <yelin at lbl.gov>
>>>To: R help <r-help at r-project.org>
>>>Cc:
>>>Sent: Wednesday, June 12, 2013 8:55 PM
>>>Subject: [R] identify data points by certain criteria
>>>
>>>Hey, I want to identify data points by certain criteria. Here is an example of my
>>>1-min data:
>>>
>>>Time Var1 Var2
>>>00:00 1 0
>>>00:01 0 0
>>>00:02 1 0
>>>00:03 1 0
>>>00:04 0 0
>>>00:05 1 0
>>>00:06 1 0
>>>00:07 1 0
>>>00:08 1 0
>>>00:09 0 0
>>>00:10 1 0
>>>00:11 1 0
>>>00:12 1 0
>>>00:13 0 0
>>>
>>>I want to identify the data points where Var1=0 and Var2=0 (in this
>>>example, the rows at 00:01, 00:04, 00:09 and 00:13), then calculate the time
>>>duration between these data points (in this example, 3 min, 5 min
>>>and 4 min), then identify the starting point of the max time duration (in
>>>this example, the starting point of the 5-min duration, i.e. the data
>>>point at 00:09), and finally return the value in the "Time" column (in this
>>>example, "00:09").
>>>
>>>Thanks for your help!
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>>______________________________________________
>>>R-help at r-project.org mailing list
>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>