[R] counting sets of consecutive integers in a vector

Mike Miller mbmiller+l at gmail.com
Mon Jan 5 01:03:03 CET 2015

I have a vector of sorted positive integer values (e.g., positive integers
after applying sort() and unique()).  For example, this:

c(1,2,5,6,7,8,25,30,31,32,33)

I want to make a matrix from that vector that has two columns: (1) the
first value in every run of consecutive integer values, and (2) the
corresponding number of consecutive values.  For example:

c(1:20) would become this...

1  20

...because there are 20 consecutive integers beginning with 1, and
c(1,2,5,6,7,8,25,30,31,32,33) would become this:

1  2
5  4
25 1
30 4

What would be the best way to accomplish this?  Here is my first effort:

v <- c(1,2,5,6,7,8,25,30,31,32,33)
L <- rle( v - 1:length(v) )$lengths
n <- length( L )
matrix( c( v[ c( 1, cumsum(L)+1 ) ][1:n], L), nrow=n)

     [,1] [,2]
[1,]    1    2
[2,]    5    4
[3,]   25    1
[4,]   30    4

I suppose that works well enough, but there may be a better way, and
besides, I wouldn't want to deny anyone here the opportunity to solve a
fun puzzle.  ;-)
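
One possible alternative, offered as a sketch rather than as "the" best way: a run starts at the first element and wherever the gap to the previous value is not 1, so diff() can locate the starts directly. The function name run_starts and the column names are made up for illustration, and the input is assumed to be sorted with no duplicates, as described above.

```r
# Hypothetical helper (name and column labels are illustrative only).
# Assumes v is sorted and has no duplicates.
run_starts <- function(v) {
  if (length(v) == 0L) {
    # No values means no runs: return an empty two-column matrix.
    return(matrix(integer(0), nrow = 0, ncol = 2,
                  dimnames = list(NULL, c("start", "length"))))
  }
  # A run begins at position 1 and wherever the step from the
  # previous value is something other than 1.
  is_start <- c(TRUE, diff(v) != 1)
  starts <- which(is_start)
  # Each run's length is the distance to the next run's start
  # (or to one past the end of the vector for the last run).
  lens <- diff(c(starts, length(v) + 1L))
  cbind(start = v[starts], length = lens)
}

run_starts(c(1, 2, 5, 6, 7, 8, 25, 30, 31, 32, 33))
```

For the example vector this should give the same four rows as the rle() version (1 2, 5 4, 25 1, 30 4), and it also handles a length-one or empty vector without special indexing.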

The use for this is that I will be doing repeated seeks of a binary file
to extract data.  seek() sets the starting point and readBin(n=X) sets
the number of values to read.  So when there are many consecutive variables
to be read, I can multiply the X in n=X by that number instead of doing
many different seek() calls.  (The data are in a transposed format where I
read in every record for some variable as sequential elements.)  I'm
probably not the first person to deal with this.
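
To make the intended use concrete, here is a rough sketch of reading runs of variables with one seek() per run. The file layout is an assumption made up for the example (n_var variables stored back to back, each as n_rec 8-byte doubles), as are all the variable names; the real file format is not described here.

```r
# Assumed layout (hypothetical): variable k occupies n_rec consecutive
# 8-byte doubles, starting at byte offset (k - 1) * n_rec * size.
n_rec <- 3L
n_var <- 10L
size  <- 8L

# Build a small demo file so the sketch is self-contained.
path <- tempfile(fileext = ".bin")
writeBin(as.double(seq_len(n_rec * n_var)), path, size = size)

# Variable indices to extract (1-based, sorted, unique).
want <- c(1, 2, 5, 6, 7, 8)

# Collapse the indices into runs: first index and count of each run.
is_start <- c(TRUE, diff(want) != 1)
starts   <- want[is_start]
lens     <- diff(c(which(is_start), length(want) + 1L))

con <- file(path, "rb")
blocks <- lapply(seq_along(starts), function(i) {
  # Jump to the first record of the first variable in this run...
  seek(con, where = (starts[i] - 1) * n_rec * size, origin = "start")
  # ...and read every record of every variable in the run in one call.
  readBin(con, what = "double", n = lens[i] * n_rec, size = size)
})
close(con)

# One column per requested variable, in the order given by `want`.
out <- matrix(unlist(blocks), nrow = n_rec)
```

With this layout the six requested variables need only two seek() calls instead of six.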

Best,

Mike

--
Michael B. Miller, Ph.D.
University of Minnesota