[R] grouping by consecutive integers

Berton Gunter gunter.berton at gene.com
Mon Jul 24 19:27:45 CEST 2006


As you do not seem to have received what you consider to be a satisfactory
reply, here is a function that I **think** does what you want:

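sequences <- function(x, incr = 1)
{
	## TRUE where an element exceeds its predecessor by exactly incr
	inrun <- c(FALSE, diff(x) == incr)
	## a change in that flag marks the first or last index of a run
	ix <- which(abs(diff(inrun)) == 1)
	## a run reaching the end of x has no closing change; close it at length(x)
	if(length(ix) %% 2) c(ix, length(x))
	else ix
}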

This function returns successive pairs of first and last indices of the
runs within x whose consecutive values increase by incr. Isolated values
(runs of length one) are not reported. You can then process these pairs
however you like to compute summary statistics on the indices and/or the
values of the sequences.
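
A minimal sketch of that post-processing, using the vector from the original
question. The helper name summarize_runs() is hypothetical, and sequences()
is restated compactly so the block runs on its own; treat it as one possible
way to do this rather than the definitive one.

```r
## Compact restatement of sequences() so this block runs on its own.
sequences <- function(x, incr = 1) {
  ix <- which(abs(diff(c(FALSE, diff(x) == incr))) == 1)
  if (length(ix) %% 2) c(ix, length(x)) else ix
}

## summarize_runs() is a hypothetical helper name, introduced for illustration.
summarize_runs <- function(x, incr = 1) {
  m <- matrix(sequences(x, incr), ncol = 2, byrow = TRUE)  # one row per run
  first <- m[, 1]
  last  <- m[, 2]
  vals  <- Map(function(a, b) x[a:b], first, last)         # values in each run
  data.frame(first  = first,
             last   = last,
             min    = vapply(vals, min, numeric(1)),
             median = vapply(vals, median, numeric(1)),
             max    = vapply(vals, max, numeric(1)))
}

tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)
res <- summarize_runs(tmp)
print(res)
```

For tmp this finds three runs, at indices 1-2, 4-16 and 17-20. The isolated
value 29 (index 3) does not appear, because it does not form a pair.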

Examples:
> sequences(c(1:5,50,3:7))
[1]  1  5  7 11
> sequences(c(10,1:5,50,3:7))
[1]  2  6  8 12
> sequences(c(1:5,50,3:7,10))
[1]  1  5  7 11
> sequences(c(10,1:5,50,3:7,10))
[1]  2  6  8 12
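
Note that isolated values such as the 50 in these examples are skipped. If
groups of length one need to be kept, as with the 29 in the original question,
a different idiom (not the function above) may be simpler: label each element
with a running group number via cumsum() and split() on it. A sketch, assuming
a step of exactly 1 defines "consecutive":

```r
tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)

## Start a new group wherever the step from the previous element is not 1.
grp <- cumsum(c(TRUE, diff(tmp) != 1))
groups <- split(tmp, grp)

## One row per group; a single value forms its own group.
stats <- t(sapply(groups, function(g)
  c(min = min(g), median = median(g), max = max(g))))
print(stats)
```

This yields four groups for tmp, including a row for 29 alone, where min,
median and max all equal 29.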

Cheers,

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific learning
process."  - George E. P. Box
 
 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Kevin J Emerson
> Sent: Monday, July 24, 2006 9:20 AM
> To: Niels Vestergaard Jensen
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] grouping by consecutive integers
> 
> Let me clarify one thing that I don't think I made clear in my posting.
> I am looking for the max, min and median of the indices, not of the
> time series frequency counts.  I am looking to find the max, min, and
> median time of peaks in a time series, so I am looking for the
> information concerning that.
> 
> So mostly my question is how to extract the max, min, and median of
> sequential numbers in a vector.  I will reword my original posting below.
> 
> > > Hello R-helpers!
> > >
> > > I have a question concerning extracting sequence information from a
> > > vector.  I have a vector (representing the bins of a time series where
> > > the frequency of occurrences is greater than some threshold) where I
> > > would like to extract the min, median and max of each group of
> > > consecutive numbers in the index vector.
> > >
> > > For Example:
> > >
> > > tmp <- c(24,25,29,35,36,37,38,39,40,41,42,43,44,45,46,47,68,69,70,71)
> > >
> > > I would like to have the max,min,median of the following groups:
> > >
> > > 24,25 - max = 25, min = 24, median = 24.5
> > > 29 - max = min = median = 29
> > > 35,36,37,38,39,40,41,42,43,44,45,46,47 - max = 47, min = 35, etc...
> > > 68,69,70,71
> > >
> > > I would like to be able to perform this for many time series so an
> > > automated process would be nice.  I am hoping to use this as a peak
> > > detection protocol.
> > >
> > > Any advice would be greatly appreciated,
> > > Kevin
> > >
> > > -----
> > > -----
> > > Kevin J Emerson
> > > Center for Ecology and Evolutionary Biology
> > > 1210 University of Oregon
> > > Eugene, OR 97403
> > > USA
> > > kemerson at uoregon.edu
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >