[R] subset only if f.e a column is successive for more than 3 values
Knut Krueger
rhe|p @end|ng |rom krueger-|@m||y@de
Fri Sep 28 17:08:31 CEST 2018
Hi Jim,
thanks, it works with the given example,
but what is the difference when using
testdata <- data.frame(
  TIME = c("17:11:20", "17:11:21", "17:11:22", "17:11:23", "17:11:24",
           "17:11:25", "17:11:26", "17:11:27", "17:11:28", "17:21:43",
           "17:22:16", "17:22:19", "18:04:48", "18:04:49", "18:04:50",
           "18:04:51", "18:04:52", "19:50:09", "00:59:27", "00:59:28",
           "00:59:29", "04:13:40", "04:13:43", "04:13:44"),
  index = c(8960, 8961, 8962, 8963, 8964, 8965, 8966, 8967, 8968, 9583,
            9616, 9619, 12168, 12169, 12170, 12171, 12172, 18489,
            37047, 37048, 37049, 48700, 48701, 48702))
seqindx <- rle(diff(testdata$index) == 1)
runsel <- seqindx$lengths >= 3 & seqindx$values
# get the indices for the starts of the runs
starts <- cumsum(seqindx$lengths)[runsel[-1]] + 1
# and the ends
ends <- cumsum(seqindx$lengths)[runsel] + 1
eval(parse(text = paste0("testdata[c(", paste(starts, ends, sep = ":", collapse = ","), "),]")))
The result (index column) is
12168,9619,9616,9583,8968,12168,12169,12170,12171,12172
Maybe the gaps between .. 8967,8968,9583,9616,9619,12168,12169 .. are the cause?
Regards Knut
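
A note on the likely cause, which does not seem to be the gaps. In the code above, runsel[-1] is used to find the run that precedes each selected run. In this data the very first run is itself selected, so it has no preceding run; starts then holds one value (13) while ends holds two (9 and 17). paste() recycles the shorter vector, giving 13:9 and 13:17, which matches the output shown. The following is only a minimal sketch of one way around that, not the method from the original reply: it takes the start and the end from the same run. The names runs, first_diff, last_diff, rows and result are illustrative.

```r
# Data from the post (index column only; TIME is not needed for the selection).
index <- c(8960:8968, 9583, 9616, 9619, 12168:12172, 18489,
           37047:37049, 48700:48702)
testdata <- data.frame(index = index)

# Runs of consecutive values show up as runs of TRUE in diff(index) == 1.
runs   <- rle(diff(testdata$index) == 1)
runsel <- runs$lengths >= 3 & runs$values    # same threshold as in the post

# Take start and end from the same run, so both vectors always have
# one element per selected run, even when the data begin with a run.
last_diff  <- cumsum(runs$lengths)           # last diff position of each run
first_diff <- last_diff - runs$lengths + 1   # first diff position of each run
starts <- first_diff[runsel]                 # first row of each selected run
ends   <- last_diff[runsel] + 1              # a run of n diffs spans n + 1 rows

# Build the row numbers directly instead of going through eval(parse()).
rows   <- unlist(Map(seq, starts, ends))
result <- testdata[rows, , drop = FALSE]
result$index
```

With the data from the post this selects rows 1 to 9 and 13 to 17, that is index 8960 to 8968 and 12168 to 12172. The two three-value groups (37047 to 37049 and 48700 to 48702) are dropped because the threshold of three TRUE differences requires at least four consecutive values.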