[R] find jumps in vector of repeats

arun smartpink111 at yahoo.com
Sat Oct 19 06:54:57 CEST 2013


In addition to Bill's method, you may also use:

vec1 <- rep(c(1,2,3,4,5), c(10,30,24,65,3))
 c(0,which(diff(vec1)!=0))
#or
 indx <- cumsum(rle(vec1)$lengths)
 c(0,indx[-length(indx)])
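
As a rough check, here is a self-contained sketch of both approaches on the example vector from the question, showing the intermediate values. The names run_ends, jumps_diff and jumps_rle are only illustrative and are not from the thread.

```r
# Example vector from the question: runs of length 10, 30, 24, 65 and 3
vec1 <- rep(c(1, 2, 3, 4, 5), c(10, 30, 24, 65, 3))

# Approach 1: diff() is non-zero exactly where a new run starts,
# so which() gives the last index of every run except the final one
jumps_diff <- c(0, which(diff(vec1) != 0))

# Approach 2: cumulative run lengths give the last index of every run;
# the final entry is just length(vec1), so it is dropped
run_ends <- cumsum(rle(vec1)$lengths)
jumps_rle <- c(0, run_ends[-length(run_ends)])

print(run_ends)
print(jumps_diff)
print(identical(jumps_diff, jumps_rle))
```

Both give 0 10 40 64 129, the form requested in the question. Note that diff() needs a numeric vector, whereas rle() also accepts character vectors.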


#Bill's method (isLastInRun(), defined in his message quoted below) was the fastest in these timings


vec3 <- rep(vec1,1e4)
system.time( res <- c(0,which(diff(vec3)!=0)))
#   user  system elapsed 
# 0.124   0.000   0.125 
 system.time({ indx <- cumsum(rle(vec3)$lengths)
 res2 <- c(0,indx[-length(indx)])})
#   user  system elapsed 
#   0.112   0.000   0.112 

 system.time({ indx <- which(isLastInRun(vec3))
 res3 <- c(0,indx[-length(indx)]) })
#   user  system elapsed 
#  0.088   0.000   0.086 
system.time({indx <- cumsum(c(0,abs(diff(vec3))))
 indx2 <- tapply(seq_along(indx),list(indx),FUN=max)
 res4 <- c(0,indx2[-length(indx2)]) })
#   user  system elapsed 
#  2.456   0.000   2.457 
 names(res4)<-NULL
 identical(res,res4)
#[1] TRUE
identical(res,res2)
#[1] TRUE
 identical(res,res3)
#[1] TRUE

A.K.



On Friday, October 18, 2013 8:31 PM, William Dunlap <wdunlap at tibco.com> wrote:
> I have a very long vector (length=1855190) it looks something like this
> 
> 1111...2222...3333....etc so it would be something equivalent of doing:
> rep(c(1,2,3,4,5), c(10,30,24,65,3))
> 
> How can I find the index of where the step/jump is? For example using the above I would
> get an index of 0, 10, 40, 64, 129

Define 2 functions:
     isFirstInRun <- function(x) c(TRUE, x[-1]!=x[-length(x)])
     isLastInRun <- function(x) c(x[-1]!=x[-length(x)], TRUE)
and use them as
     > z <- rep(c(1,2,3,4,5), c(10,30,24,65,3))
     > which(isLastInRun(z))
     [1]  10  40  64 129 132
     > which(isFirstInRun(z))
     [1]   1  11  41  65 130
(0 is not a valid R index into a vector, so I prefer one of
the above results, but you can fiddle with the endpoints
as you wish.)
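
If the 0-prefixed form from the question is what you want, one possible way to fiddle with the endpoints is sketched below. jumpIndex is a hypothetical helper name, not something from this thread, and it assumes a non-empty vector with no NA values. Because it only relies on !=, it should also work on character vectors, where diff() would fail.

```r
# Bill's helper, repeated here so the example is self-contained
isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)

# Hypothetical helper: last index of every run except the final one,
# with a leading 0 as in the original question
jumpIndex <- function(x) {
  ends <- which(isLastInRun(x))
  c(0, ends[-length(ends)])
}

z <- rep(c(1, 2, 3, 4, 5), c(10, 30, 24, 65, 3))
print(jumpIndex(z))

# A character vector: runs "a","a" / "b","b","b" / "c"
s <- rep(c("a", "b", "c"), c(2, 3, 1))
print(jumpIndex(s))
```

For z this gives 0 10 40 64 129, and for s it gives 0 2 5.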

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Benton, Paul
> Sent: Friday, October 18, 2013 5:18 PM
> To: r-help at r-project.org
> Subject: [R] find jumps in vector of repeats
> 
> Hello all,
> 
> I'm not really sure how to search for this in google/Rseek so there is probably a
> command to do it. I also know I could write an apply loop to find it but thought I would
> ask all you lovely R gurus.
> 
> I have a very long vector (length=1855190) it looks something like this
> 
> 1111...2222...3333....etc so it would be something equivalent of doing:
> rep(c(1,2,3,4,5), c(10,30,24,65,3))
> 
> How can I find the index of where the step/jump is? For example using the above I would
> get an index of 0, 10, 40, 64, 129
> 
> Any help would be greatly appreciated.
> 
> Cheers,
> 
> Paul
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

