Hi, I am seeking help on designing an algorithm to identify the locations of
stretches of 1s in a vector of 0s and 1s. Below is a simple example:
> dat<-as.data.frame(cbind(a=c(F,F,T,T,T,T,F,F,T,T,F,T,T,T,T,F,F,F,F,T)
,b=c(4,12,13,16,18,20,28,30,34,46,47,49,61,73,77,84,87,90,95,97)))
> dat
   a  b
1  0  4
2  0 12
3  1 13
4  1 16
5  1 18
6  1 20
7  0 28
8  0 30
9  1 34
10 1 46
11 0 47
12 1 49
13 1 61
14 1 73
15 1 77
16 0 84
17 0 87
18 0 90
19 0 95
20 1 97
In this dataset, "b" is sorted and gives the location of each element of "a".
I would like to find the starting and ending locations of each stretch of 1s
in "a", and also count the number of 1s in each stretch. The result I am hoping
for from the algorithm is:
stretch start end No.of.1s
      1    13  20        4
      2    34  46        2
      3    49  77        4
      4    97  97        1
I imagine a for loop could do the job, but that does not feel like a clever way
to do it. Is there an efficient algorithm that can do this quickly?
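For concreteness, a loop-based baseline might look roughly like the sketch
below. The variable names (in_run, starts, ends, counts, loop_res) are
illustrative only and do not come from the post, and the data frame is rebuilt
directly with 0/1 values in "a".

```r
# Example data: 'a' is the 0/1 indicator, 'b' holds the sorted locations.
dat <- data.frame(
  a = c(0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1),
  b = c(4, 12, 13, 16, 18, 20, 28, 30, 34, 46, 47, 49, 61, 73, 77, 84, 87, 90, 95, 97)
)

# Illustrative accumulators, one entry per stretch found so far.
starts <- numeric(0)
ends   <- numeric(0)
counts <- numeric(0)
in_run <- FALSE  # TRUE while the loop is inside a stretch of 1s

for (i in seq_len(nrow(dat))) {
  if (dat$a[i] == 1) {
    if (!in_run) {
      # A new stretch begins at this row.
      starts <- c(starts, dat$b[i])
      ends   <- c(ends, dat$b[i])
      counts <- c(counts, 1)
      in_run <- TRUE
    } else {
      # Extend the current stretch: move its end and bump its count.
      ends[length(ends)]     <- dat$b[i]
      counts[length(counts)] <- counts[length(counts)] + 1
    }
  } else {
    in_run <- FALSE
  }
}

loop_res <- data.frame(
  stretch  = seq_along(starts),
  start    = starts,
  end      = ends,
  No.of.1s = counts
)
print(loop_res)
```

Growing the vectors with c() inside the loop is simple but copies them on every
new stretch, which is part of why this version scales poorly on long inputs.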
Thanks for any suggestions.
John
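
One possible vectorized approach to the question above is run-length encoding
with base R's rle(). The following is only a sketch under stated assumptions:
find_stretches is a hypothetical helper name (not an existing R function), "a"
is assumed to contain only 0/1 (or FALSE/TRUE) with no NA values, and "b" is
assumed to have the same length as "a".

```r
# Example data: 'a' is the 0/1 indicator, 'b' holds the sorted locations.
dat <- data.frame(
  a = c(0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1),
  b = c(4, 12, 13, 16, 18, 20, 28, 30, 34, 46, 47, 49, 61, 73, 77, 84, 87, 90, 95, 97)
)

# Hypothetical helper: assumes 'a' has no NAs and length(a) == length(b).
find_stretches <- function(a, b) {
  runs  <- rle(as.integer(a))        # consecutive runs of equal values
  last  <- cumsum(runs$lengths)      # index of the last element of each run
  first <- last - runs$lengths + 1L  # index of the first element of each run
  keep  <- runs$values == 1L         # keep only the runs made of 1s
  data.frame(
    stretch  = seq_len(sum(keep)),
    start    = b[first[keep]],
    end      = b[last[keep]],
    No.of.1s = runs$lengths[keep]
  )
}

res <- find_stretches(dat$a, dat$b)
print(res)
```

On the example data this yields four stretches with starts 13, 34, 49, 97, ends
20, 46, 77, 97, and counts 4, 2, 4, 1. All steps are vectorized, so no explicit
loop over the rows is needed.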