[R] identifying a 'run' in a vector

Gabor Grothendieck ggrothendieck at gmail.com
Thu Jul 7 03:19:11 CEST 2011


On Wed, Jul 6, 2011 at 7:32 PM, B77S <bps0002 at auburn.edu> wrote:
> Hi,
>
> How can I discern which elements in x (see below) are in 'order', but more
> specifically.. only the 1st 'ordered run'?
> I would like for it to return elements 1:8... there may be ordered values
> after 1:8, but those are not of interest.
>
> x <- c(1, 2, 3, 4, 5, 6, 7, 8, 20, 21, 22, 45)
>
>

Since the definition of an ordered run is not given, we assume that it
is a sequence of numbers in which each number is exactly 1 greater than
the prior number.  If that is not what you mean, then you will need to
clarify the problem definition.

First calculate a logical vector which is TRUE at each position that
starts a new run.  The first position in x always starts a new run,
even if that run is a singleton, so it can be set to TRUE.  The
remaining elements can be computed by testing diff(x) != 1, which is
TRUE wherever the step from the previous element is not 1.  The
resulting logical vector is the argument to cumsum below.
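
As a rough illustration of this first step, the sketch below builds the
logical vector on its own, using the x from the original question.  The
names steps and starts are only illustrative and are not part of the
one-liner.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 20, 21, 22, 45)

# Step from each element to the next; one element shorter than x
steps <- diff(x)

# TRUE where a new run starts; the first position always starts a run
starts <- c(TRUE, steps != 1)

# Positions in x at which a run begins
which(starts)
```

Prepending TRUE makes starts the same length as x, so it can later be
used to index x directly.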

Taking the cumulative sum of this logical vector gives a vector the
same length as x but with each element of the 1st run replaced with 1,
each element of the 2nd run replaced with 2 and so on.
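
To see the run numbering explicitly, something like the following
should work.  The names run_id and runs are made up for this sketch,
and split is used here only to display the groups; it is not needed for
the final answer.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 20, 21, 22, 45)

# Each TRUE (start of a run) bumps the running total by one,
# so every element is labelled with the number of its run
run_id <- cumsum(c(TRUE, diff(x) != 1))

# Group the elements of x by run number to inspect all the runs
runs <- split(x, run_id)
runs
```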

Finally, since we only want the 1st run we pick out those positions of
x where the cumsum equals 1.

> x[cumsum(c(TRUE, diff(x) != 1)) == 1]
[1] 1 2 3 4 5 6 7 8
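
If this is needed more than once, the one-liner could be wrapped in a
small function.  This is only a sketch: first_run and its step argument
are hypothetical names, and the step argument is an assumed
generalization for runs that increase by something other than 1.  The
comparison is exact, so non-integer steps may need a tolerance instead.

```r
# Return the elements of the first run in x, where a run is a stretch
# of values each exactly `step` greater than the one before it
first_run <- function(x, step = 1) {
  x[cumsum(c(TRUE, diff(x) != step)) == 1]
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 20, 21, 22, 45)
first_run(x)

# If positions rather than values are wanted, which() gives them
which(cumsum(c(TRUE, diff(x) != 1)) == 1)
```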


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


