[R] gapped sequence data summary

William Dunlap wdunlap at tibco.com
Mon Jul 26 18:48:28 CEST 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of jd6688
> Sent: Monday, July 26, 2010 9:23 AM
> To: r-help at r-project.org
> Subject: [R] gapped sequence data summary
> 
> 
>    Id cat1 location item_values p-values sequence 
> a111 1 3002737     100             0.01       1 
> a112 1 3017821     102             0.05       2 
> a113 2 3027730     103             0.02       3 
> a114 2 3036220     104             0.04       4 
> a115 1 3053984     105             0.03       5 
> 
> a118 1 3090500     106             0.02       8 
> a119 1 3103304     107             0.03       9       
> a120 2 3090500     106             0.02       10 
> a121 2 3103304     107             0.03       11     
> 
> what I am trying to accomplish is:
> 
> for sequence 1:5
>    cat1   start of location   end of location   peak item_values
>       1             3002737           3053984                105
>       2             3027730           3036220                104
> 
> for sequence 8:11
>    cat1   start of location   end of location   peak item_values
>       1             3090500           3103304                107
>       2             3090500           3103304                107
> 
> and so on...

To find which rows are the first and last rows of a run of
consecutive numbers (each differing from the previous one
by 1), you can apply the functions
    first <- function(x) c(TRUE, diff(x) != 1)
    last <- function(x) c(diff(x) != 1, TRUE)
to 'sequence'.
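
As a quick check, here is a small sketch applying those two
functions to the 'sequence' column from the question. The
vector is typed in by hand from the posted data, and the
names isFirst and isLast are only for illustration.

```r
# 'sequence' values copied from the example data in the question
sequence <- c(1, 2, 3, 4, 5, 8, 9, 10, 11)

# TRUE at the first (resp. last) element of each run of consecutive values
first <- function(x) c(TRUE, diff(x) != 1)
last  <- function(x) c(diff(x) != 1, TRUE)

isFirst <- first(sequence)
isLast  <- last(sequence)

sequence[isFirst]  # 1 8
sequence[isLast]   # 5 11
```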

You can assign a group identifier to each run with
    runNumber <- cumsum(c(TRUE, diff(sequence) != 1))
and use aggregate() or one of the functions in the
plyr package to apply a summary function to each
group.
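
One way this might look with aggregate(), as a rough sketch
rather than the only way to do it. The data frame 'dat' is
typed in from the question. I am assuming that "start" and
"end" mean the smallest and largest location within each
run and cat1, and that "peak" means the largest
item_values. Note the comparison diff(...) != 1, so that the
cumulative sum increases only where there is a gap.

```r
# Example data typed in from the question (p-values column omitted)
dat <- data.frame(
  Id = c("a111", "a112", "a113", "a114", "a115",
         "a118", "a119", "a120", "a121"),
  cat1 = c(1, 1, 2, 2, 1, 1, 1, 2, 2),
  location = c(3002737, 3017821, 3027730, 3036220, 3053984,
               3090500, 3103304, 3090500, 3103304),
  item_values = c(100, 102, 103, 104, 105, 106, 107, 106, 107),
  sequence = c(1, 2, 3, 4, 5, 8, 9, 10, 11)
)

# Group identifier: increases by one at every gap in 'sequence'
dat$runNumber <- cumsum(c(TRUE, diff(dat$sequence) != 1))

# Summarise each (runNumber, cat1) group; the three calls use the same
# grouping, so their rows come back in the same order
groups <- list(runNumber = dat$runNumber, cat1 = dat$cat1)
starts <- aggregate(dat["location"], groups, min)
ends   <- aggregate(dat["location"], groups, max)
peaks  <- aggregate(dat["item_values"], groups, max)

result <- data.frame(runNumber = starts$runNumber,
                     cat1  = starts$cat1,
                     start = starts$location,
                     end   = ends$location,
                     peak  = peaks$item_values)
result <- result[order(result$runNumber, result$cat1), ]
rownames(result) <- NULL
print(result)
```

If "start" and "end" should instead be the locations on the
first and last rows of each group, replace min and max with
functions such as function(v) v[1] and
function(v) v[length(v)].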

If there might be NA's in the sequence variable you
will have to modify these a bit.
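
One possible modification, as a sketch only: treat any
comparison involving an NA as a break, so an NA row never
joins a neighbouring run. That is an assumption about what
you want; you may prefer to drop the NA rows first. The
function name runNumberNA and the vector x are made up for
this example.

```r
# Run identifier that starts a new run at every gap and around every NA
runNumberNA <- function(x) {
  isBreak <- diff(x) != 1
  isBreak[is.na(isBreak)] <- TRUE  # assumption: an NA always breaks a run
  cumsum(c(TRUE, isBreak))
}

x <- c(1, 2, NA, 4, 5)
runs <- runNumberNA(x)
runs  # 1 1 2 3 3
```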

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  


> 
> I have been trying to find a way to accomplish this, but I
> haven't found one that works as expected.
> Would you shed some light on this? Thanks,
> 
> 
>      
> -- 
> View this message in context: 
> http://r.789695.n4.nabble.com/gapped-sequence-data-summary-tp2302552p2302552.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


