[R] Finding "runs" of TRUE in binary vector
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Thu Jan 27 23:49:27 CET 2005
Sean Davis <sdavis2 at mail.nih.gov> writes:
> I have a binary vector and I want to find all "regions" of that vector
> that are runs of TRUE (or FALSE).
>
> > a <- rnorm(10)
> > b <- a<0.5
> > b
> [1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE
>
> My function would return something like a list:
> region[[1]] 1,3
> region[[2]] 5,5
> region[[3]] 7,10
>
> Any ideas besides looping and setting start and ends directly?
You could base it on
> rle(b)
Run Length Encoding
lengths: int [1:5] 1 1 2 4 2
values : logi [1:5] TRUE FALSE TRUE FALSE TRUE
> b
[1] TRUE FALSE TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE
(Notice that my b differs from yours)
then you might proceed with
> end <- cumsum(rle(b)$lengths)
> start <- rev(length(b) + 1 - cumsum(rev(rle(b)$lengths)))
> # or: start <- c(1, end[-length(end)] + 1)
> cbind(start,end)[rle(b)$values, , drop=FALSE]
     start end
[1,]     1   1
[2,]     3   4
[3,]     9  10
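
The steps above could be wrapped into a small function, roughly as sketched here. This is only an illustration: the name find_runs and its value argument are hypothetical and not part of base R, and the sketch assumes a logical vector without NAs. It computes start as end - lengths + 1, which should give the same result as either form of start shown above.

```r
# Hypothetical helper; the name find_runs is an assumption, not a base R function.
find_runs <- function(x, value = TRUE) {
  # Assumption: x is a logical vector with no missing values.
  stopifnot(is.logical(x), !anyNA(x))
  r <- rle(x)
  end <- cumsum(r$lengths)          # last index of each run
  start <- end - r$lengths + 1L     # first index of each run
  keep <- r$values == value         # runs of TRUE by default
  # drop = FALSE keeps a two-column matrix even if only one run matches
  cbind(start = start, end = end)[keep, , drop = FALSE]
}

b <- c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)
runs <- find_runs(b)
runs

# One c(start, end) pair per run, similar to the list asked for in the question
regions <- lapply(seq_len(nrow(runs)), function(i) unname(runs[i, ]))
```

Calling find_runs(b, value = FALSE) would return the runs of FALSE instead.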
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907