[R] How to search for a sequence(and its combination) inside a vector?

C W tmrsg11 at gmail.com
Sat Jul 20 20:06:26 CEST 2013


Hi John,
I am doing sparsity recovery with glmnet.

Elements 1, 2, 3 of the repeating sequence are the nonzero elements,
but they are not always recovered.
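
One way to tally how many of the elements 1, 2, 3 were recovered in each repetition is to split the index vector wherever the index stops increasing, and then count per block. This is only a minimal sketch: it assumes each repetition is strictly increasing and that a drop in the index marks the start of a new repetition (a repetition that ends low and is followed by one that starts higher would not be detected). The names idx, cycle, hits and tally are illustrative, and the short vector is made up for the example.

```r
# Toy index vector (hypothetical data): repetitions of 1:15 with entries deleted
idx <- c(1, 2, 3, 5, 8,   1, 3, 7,   4, 9,   1, 2, 3)

# A new repetition is assumed to start whenever the index does not increase
cycle <- cumsum(c(TRUE, diff(idx) <= 0))

# Number of elements from {1, 2, 3} present in each repetition
hits <- tapply(idx %in% 1:3, cycle, sum)

# How many repetitions recovered 0, 1, 2 or all 3 of the elements
tally <- table(factor(hits, levels = 0:3))
print(tally)
```

Using factor() with explicit levels keeps the zero counts in the table, so repetitions in which none of the three elements was recovered are still reported.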

Here is what my original data frame looks like:
> df[1:15, ]
    i             x
2   1  0.0869399788
3   2 -0.0994713934
4   3  0.0720312837
5   4  0.0075392684
6   5  0.0130364386
7   6  0.0238318855
8   7 -0.0152197121
9   8  0.0097389626
10 13  0.0005068968
12  1  0.0679442455
13  2 -0.0647438953
14  3  0.0656297104
15  5  0.0003406059
16  7  0.0241146788
17  8  0.0093850612


I dropped the data column and kept only the index.
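
With only the index left, the runs of length 5 and 6 in the quoted table below appear to come from repetitions that sit back to back. rle() on vec %in% c(1, 2, 3) sees no FALSE between them, so a stretch such as 1, 2, 3, 1, 2, 3, 10 becomes one run of 6, and 1, 3, 1, 2, 3 becomes one run of 5; both patterns occur in the full vec. A small sketch of the effect, where v, r and merged are illustrative names and v is a short excerpt-style vector rather than the full data:

```r
# Back-to-back repetitions with no other index between them
v <- c(1, 2, 3, 1, 2, 3, 10, 1, 3, 1, 2, 3, 5)

# Run-length encode the membership test, as in the thread
r <- rle(v %in% c(1, 2, 3))

# Lengths of the TRUE runs: adjacent repetitions are merged into one run
merged <- r$lengths[r$values]
print(merged)
```

So a run length from rle() is the number of consecutive indices in 1:3, not the number recovered within a single repetition.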

-M


On Sat, Jul 20, 2013 at 1:50 PM, C W <tmrsg11 at gmail.com> wrote:
> Thanks, you guys are correct, I had different data.
> But why do I get lengths 5 and 6? It should only be 1 to 3.
>
> Full R code:
>
> vec <- c(1, 2, 3, 4, 5, 6, 7, 8, 13, 1, 2, 3, 5, 7, 8, 10, 12, 13, 14,
> 15, 1, 2, 3, 5, 6, 10, 12, 13, 1, 2, 3, 4, 5, 6, 7, 12, 13, 14,
> 15, 1, 2, 3, 6, 9, 10, 11, 13, 14, 1, 7, 10, 13, 1, 2, 3, 4,
> 6, 7, 9, 11, 14, 1, 2, 3, 5, 9, 10, 11, 12, 14, 1, 2, 3, 4, 1,
> 2, 3, 4, 11, 12, 14, 1, 2, 3, 4, 8, 11, 12, 1, 2, 3, 4, 5, 7,
> 8, 9, 11, 12, 15, 3, 14, 1, 2, 3, 6, 10, 11, 13, 14, 1, 2, 3,
> 4, 5, 6, 8, 9, 10, 11, 12, 14, 1, 2, 3, 4, 9, 13, 15, 1, 2, 3,
> 4, 6, 8, 9, 11, 12, 1, 2, 3, 7, 8, 9, 14, 1, 2, 3, 12, 1, 2,
> 3, 4, 5, 10, 14, 1, 2, 3, 4, 5, 7, 8, 12, 13, 14, 1, 2, 3, 10,
> 1, 3, 1, 2, 3, 5, 7, 8, 10, 11, 13, 14, 1, 2, 3, 4, 5, 8, 9,
> 11, 12, 15, 1, 2, 3, 4, 7, 9, 10, 13, 1, 2, 3, 4, 5, 7, 10, 11,
> 15, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 1, 2, 3, 6, 7,
> 8, 9, 10, 12, 13, 14, 15, 1, 2, 3, 4, 7, 1, 2, 3, 5, 8, 13, 1,
> 2, 3, 5, 8, 11, 15, 1, 2, 3, 1, 2, 3, 10, 1, 2, 3, 4, 7, 8, 9,
> 10, 11, 12, 14, 1, 3, 9, 11, 13, 14, 1, 2, 3, 4, 5, 7, 8, 9,
> 10, 11, 12, 13, 14, 15, 1, 2, 3, 4, 5, 13, 14, 15, 1, 2, 3, 11,
> 13, 14, 1, 2, 3, 8, 1, 2, 3, 4, 5, 6, 8, 11, 12, 14, 1, 2, 3,
> 5, 6, 9, 10, 11, 12, 15, 1, 2, 3, 4, 5, 9, 11, 12, 13, 1, 2,
> 3, 4, 5, 13, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15,
> 1, 2, 3, 7, 8, 9, 1, 2, 3, 5, 7, 8, 9, 10, 12, 14, 15, 1, 2,
> 3, 4, 5, 6, 8, 14, 1, 2, 3, 1, 2, 3, 10, 11, 13, 1, 2, 3, 4,
> 9, 10, 12, 13, 14, 1, 2, 3, 4, 5, 6, 12, 1, 2, 3, 4, 5, 6, 7,
> 10, 12, 13, 14, 15, 1, 2, 3, 6, 10, 14, 1, 2, 3, 4, 6, 7, 8,
> 9, 10, 11, 13, 14, 1, 2, 3, 1, 2, 3, 4, 7, 8, 10, 1, 2, 3, 7,
> 8, 11, 13, 15, 1, 2, 3, 4, 7, 8, 14, 15, 1, 2, 3, 4, 14, 1, 2,
> 3, 4, 6, 7, 10, 12, 1, 2, 3, 5, 7, 8, 11, 13, 14, 15, 1, 2, 3,
> 4, 1, 2, 3, 6, 7, 9, 11, 12, 13, 14, 1, 2, 3, 7, 11, 12, 1, 2,
> 3, 5, 6, 8, 9, 10, 12, 15, 1, 2, 3, 5, 6, 8, 9, 11, 1, 2, 3,
> 7, 8, 11, 13, 14, 15, 1, 2, 3, 4, 10, 12, 14, 1, 2, 3, 11, 12,
> 13, 15, 1, 2, 3, 5, 7, 10, 11, 12, 13, 14, 15, 1, 3, 10, 1, 2,
> 3, 1, 2, 3, 8, 10, 15, 1, 2, 3, 4, 7, 10, 12, 14, 1, 2, 3, 9,
> 10, 11, 1, 2, 3, 6, 9, 10, 15, 1, 9, 14, 1, 2, 3, 7, 10, 14,
> 1, 2, 3, 4, 7, 8, 9, 10, 11, 13, 15, 1, 2, 3, 5, 6, 7, 8, 9,
> 11, 12, 13, 14, 15, 1, 2, 3, 6, 8, 11, 12, 1, 7, 1, 2, 3, 8,
> 13, 15, 1, 2, 3, 4, 8, 9, 11, 1, 2, 3, 4, 7, 13, 14, 1, 2, 3,
> 5, 6, 9, 1, 3, 7, 12, 13, 15, 1, 2, 3, 5, 6, 8, 10, 1, 3, 5,
> 7, 8, 10, 11, 1, 2, 3, 5, 6, 11, 14, 1, 2, 3, 4, 9, 10, 11, 13,
> 14, 1, 2, 3, 4, 6, 9, 14, 15, 1, 2, 3, 11, 1, 2, 3, 4, 5, 7,
> 11, 13, 14, 15, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 14, 1, 2, 3, 4,
> 5, 6, 7, 9, 10, 11, 12, 14, 1, 4, 7, 1, 2, 3, 5, 6, 8, 9, 10,
> 12, 13, 15, 1, 2, 3, 4, 5, 8, 10, 12, 13)
> a <- vec %in% c(1, 2, 3)
> b <- rle(a)
> cc  <-  data.frame(b[[1]], b[[2]])
> names(cc)  <-  c("leng", 'val')
> dd  <-  subset(cc, val ==TRUE )
> table(dd)
>
>> table(dd)
>     val
> leng TRUE
>    1    5
>    2    4
>    3   81
>    5    1
>    6    4
>
> btw,
>> length(vec)
> [1] 762
>
> So, the tally should add up to that if correct.
>
> -M
>
>
> On Sat, Jul 20, 2013 at 1:41 PM, John Kane <jrkrideau at inbox.com> wrote:
>> Beats me. I get:
>> table(dd)
>>     val
>> leng TRUE
>>    1    3
>>    3   12
>>
>> What does dd look like?  In my case I get this, where the first column is the row number:
>> dd
>>    leng  val
>> 1     3 TRUE
>> 3     3 TRUE
>> 5     3 TRUE
>> 7     3 TRUE
>> 9     3 TRUE
>> 11    1 TRUE
>> 13    3 TRUE
>> 15    3 TRUE
>> 17    3 TRUE
>> 19    3 TRUE
>> 21    3 TRUE
>> 23    3 TRUE
>> 25    1 TRUE
>> 27    3 TRUE
>> 29    1 TRUE
>>
>> John Kane
>> Kingston ON Canada
>>
>>
>>> -----Original Message-----
>>> From: tmrsg11 at gmail.com
>>> Sent: Sat, 20 Jul 2013 13:11:47 -0400
>>> To: jrkrideau at inbox.com
>>> Subject: Re: [R] How to search for a sequence(and its combination) inside a vector?
>>>
>>> Thanks John.
>>>
>>> Why do I get lengths of 5 and 6?  I thought I was only tallying up 1 to 3.
>>>> table(dd)
>>>     val
>>> leng TRUE
>>>    1    5
>>>    2    4
>>>    3   81
>>>    5    1
>>>    6    4
>>>
>>> -M
>>>
>>> On Sat, Jul 20, 2013 at 12:52 PM, John Kane <jrkrideau at inbox.com> wrote:
>>>> Taking Berend's example a bit further, this seems to work.
>>>>
>>>> If you use str(b) you will see that it is a list.
>>>>
>>>> b <- rle(a)
>>>> cc  <-  data.frame(b[[1]], b[[2]])
>>>> names(cc)  <-  c("leng", 'val')
>>>> dd  <-  subset(cc, val ==TRUE )
>>>> table(dd)
>>>>
>>>> John Kane
>>>> Kingston ON Canada
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: tmrsg11 at gmail.com
>>>>> Sent: Sat, 20 Jul 2013 12:36:55 -0400
>>>>> To: bhh at xs4all.nl
>>>>> Subject: Re: [R] How to search for a sequence(and its combination) inside a vector?
>>>>>
>>>>> Hi Berend,
>>>>> I am looking for a table of the number of times one element
>>>>> (out of 1, 2, 3) showed up, two elements, and all three.
>>>>>
>>>>> I am trying this, but I don't know if it works:
>>>>>
>>>>>> aa <- rle(a)
>>>>>> b <- aa$lengths[aa$values]
>>>>>> table(b)
>>>>> b
>>>>>  1  3
>>>>>  3 12
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jul 20, 2013 at 12:24 PM, Berend Hasselman <bhh at xs4all.nl> wrote:
>>>>>>
>>>>>> On 20-07-2013, at 18:05, C W <tmrsg11 at gmail.com> wrote:
>>>>>>
>>>>>>> Hi R list,
>>>>>>>
>>>>>>> I have a sequence repeating 1:15.  Some numbers are deleted.  I want
>>>>>>> to find how many times 1, 2, 3 appeared.
>>>>>>> Basically, I want to "grab" the beginning of the sequence and tally it up.
>>>>>>>
>>>>>>> R code:
>>>>>>>
>>>>>>>> vec <- c(1, 2, 3, 4, 5, 6, 7, 8, 13, 1, 2, 3, 5, 7, 8, 10, 12, 13,
>>>>>>> 14, 15, 1, 2, 3, 5, 6, 10, 12, 13, 1, 2, 3, 4, 5, 6, 7, 12, 13, 14,
>>>>>>> 15, 1, 2, 3, 6, 9, 10, 11, 13, 14, 1, 7, 10, 13, 1, 2, 3, 4,
>>>>>>> 6, 7, 9, 11, 14, 1, 2, 3, 5, 9, 10, 11, 12, 14, 1, 2, 3, 4, 1,
>>>>>>> 2, 3, 4, 11, 12, 14, 1, 2, 3, 4, 8, 11, 12, 1, 2, 3, 4, 5, 7,
>>>>>>> 8, 9, 11, 12, 15, 3, 14, 1, 2, 3, 6, 10, 11, 13, 14, 1)
>>>>>>>
>>>>>>>> a <- vec %in% c(1, 2, 3)
>>>>>>>> a
>>>>>>>  [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE
>>>>>>> [15] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
>>>>>>> [29]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
>>>>>>> [43] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE
>>>>>>> [57] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
>>>>>>> [71]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
>>>>>>> [85] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>>>>> [99] FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE
>>>>>>>
>>>>>>>> rle(a)
>>>>>>> Run Length Encoding
>>>>>>>  lengths: int [1:29] 3 6 3 8 3 5 3 8 3 6 ...
>>>>>>>  values : logi [1:29] TRUE FALSE TRUE FALSE TRUE FALSE ...
>>>>>>>
>>>>>>> What should I do after this?
>>>>>>>
>>>>>>
>>>>>> Well how about
>>>>>>
>>>>>> sum(a)
>>>>>>
>>>>>> or
>>>>>>
>>>>>> b <- rle(a)
>>>>>> sum(b$lengths[b$values])
>>>>>>
>>>>>> Berend
>>>>>>
>>>>>>> Thanks,
>>>>>>> Mike
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>
>>>>
>>


