[R] Sequence detection longer than a certain value

Bert Gunter gunter.berton at gene.com
Tue Aug 21 20:59:50 CEST 2012


Rui:

It's much simpler than you propose, which is why I left it to the OP.
Just use the result of rle() to build a logical vector, one element per
row, and index id with it. For example, the solution for the OP's example
becomes (d is the data frame containing id and VI):

> with(d,{
+ z <- rle(VI < 1)
+ ## vals is TRUE only for runs where both conditions are satisfied
+ vals <- with(z, values & lengths >= 5)
+ id[rep(vals,z$lengths)]
+ })
[1] 4 5 6 7 8
## The +'s are just continuation prompts on the console
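
If it helps, here is the same idea as a self-contained sketch that can be
pasted as is. The name run_ids() and its arguments are made up for
illustration and are not from anything earlier in the thread; it also
assumes VI has no missing values, since rle() treats each NA as its own run.

```r
## Example data from the original post
d <- data.frame(id = 1:10,
                VI = c(-10, -4, 5, -2, -5, -3, -2, -1, 4, 8))

## run_ids() is a hypothetical helper name (an assumption, not from the thread).
## It returns the ids that belong to runs of at least `n` consecutive
## values of `vi` below `thres`. Assumes `vi` contains no NAs.
run_ids <- function(id, vi, n, thres) {
  z <- rle(vi < thres)                # runs of TRUE/FALSE
  keep <- z$values & z$lengths >= n   # one flag per run
  id[rep(keep, z$lengths)]            # expand flags back to one per row
}

## What rle() sees for the example with threshold 1
z <- rle(d$VI < 1)
z$values    # TRUE FALSE TRUE FALSE
z$lengths   # 2 1 5 2

run_ids(d$id, d$VI, n = 5, thres = 1)   # 4 5 6 7 8
run_ids(d$id, d$VI, n = 2, thres = 1)   # 1 2 4 5 6 7 8
```

The rep(keep, z$lengths) step is what turns the per-run flags back into a
per-row logical vector, so it can index id (or any other column) directly.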

-- Bert





On Tue, Aug 21, 2012 at 11:35 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Try the following
>
>
> d <- read.table(text="
> id VI
> 1 -10
> 2  -4
> 3  5
> 4 -2
> 5 -5
> 6 -3
> 7 -2
> 8 -1
> 9  4
> 10 8
> ", header = TRUE)
>
>
> fun <- function(n, thres){
>     r <- rle(d$VI < thres)                    # runs below the threshold
>     inx <- which(r$values & r$lengths >= n)   # runs that are long enough
>     csum <- cumsum(r$lengths)                 # end position of each run
>     from <- c(0, csum)[inx] + 1               # start position of each run
>     cbind(from = from, to = csum[inx])
> }
> fun(5, 1)
> fun(2, 1)
>
>
> Hope this helps,
>
> Rui Barradas
> On 21-08-2012 18:54, inti luna wrote:
>>
>> Hello,
>>
>> I have 2 variables: one is an "id" sequence from 1:1000 and the other is
>> a variable "VI" with real values from -15.0 to 20.0. I want to detect the
>> id values whose VI values are less than a certain threshold (for example,
>> x = 1) BUT that occur in a run of 5 or more consecutive values.
>>
>> For instance, in the following column I want to recognize the sequence
>> from "id" 4 to 8, whose "VI" values are lower than 1 in a run of 5, and
>> not the id values 1 and 2, whose VI values are lower than my threshold
>> but form a run shorter than 5.
>>
>> id VI
>>
>> 1 -10
>> 2  -4
>> 3  5
>> 4 -2
>> 5 -5
>> 6 -3
>> 7 -2
>> 8 -1
>> 9  4
>> 10 8
>>
>>   Any help would be appreciated!
>>
>> Inti
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



