[R] overlapping intervals

jim holtman jholtman at gmail.com
Mon Oct 16 02:34:35 CEST 2006


This is not the most efficient approach, and it requires integer
endpoints (with the largest value probably under about 1M, since it
allocates a logical vector of that length). My results show an
additional overlap at 40, where the start and end are the same. Does
this count? If not, just delete the rows where both columns are equal.


> series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
> series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
> x1 <- x2 <- logical(max(series1, series2))  # all-FALSE vectors, one element per integer position
> x1[unlist(mapply(seq, series1[,1], series1[,2]))] <- TRUE
> x2[unlist(mapply(seq, series2[,1], series2[,2]))] <- TRUE
> r <- rle(x1 & x2)  # determine overlaps
> offset <- cumsum(r$lengths)
> (z <- cbind(offset[r$values] - r$lengths[r$values] + 1, offset[r$values]))
     [,1] [,2]
[1,]   25   26
[2,]   40   40
[3,]   60   70
[4,]  300  350
> # to drop single-point overlaps (start == end), such as the one at 40
> z[z[,1] != z[,2], , drop = FALSE]
     [,1] [,2]
[1,]   25   26
[2,]   60   70
[3,]  300  350
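
If the endpoints are not integers, or the values are too large to index a
vector, comparing every interval in one series against every interval in
the other may be an alternative. The following is only a minimal sketch,
assuming closed intervals and no overlaps within a single series;
`shared_intervals` and `keep_points` are hypothetical names made up for
illustration, not existing R functions.

```r
series1 <- cbind(Start = c(10, 21, 40, 300), End = c(20, 26, 70, 350))
series2 <- cbind(Start = c(25, 60, 210, 500), End = c(40, 100, 400, 1000))

# Hypothetical helper: pairwise intersection of two interval matrices
# (column 1 = start, column 2 = end).
shared_intervals <- function(a, b, keep_points = FALSE) {
  # every pairing of a row of `a` with a row of `b`
  idx <- expand.grid(i = seq_len(nrow(a)), j = seq_len(nrow(b)))
  # an overlap runs from the later start to the earlier end
  start <- pmax(a[idx$i, 1], b[idx$j, 1])
  end   <- pmin(a[idx$i, 2], b[idx$j, 2])
  # keep real overlaps; single-point overlaps only if requested
  keep <- if (keep_points) start <= end else start < end
  out <- cbind(Start = start[keep], End = end[keep])
  out[order(out[, "Start"]), , drop = FALSE]
}

shared <- shared_intervals(series1, series2)
print(shared)
```

With `keep_points = TRUE` the single-point overlap at 40 is kept as well.
Unlike the run-length approach, this sketch does not merge adjacent
overlaps, and the work grows with nrow(a) * nrow(b), so it is meant for
modest numbers of intervals.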

On 10/15/06, Giovanni Coppola <gcoppola at ucla.edu> wrote:
> Hello everybody,
>
> I have two series of intervals, and I'd like to output the shared
> regions.
> For example:
> series1<-cbind(Start=c(10,21,40,300),End=c(20,26,70,350))
> series2<-cbind(Start=c(25,60,210,500),End=c(40,100,400,1000))
>
>  > series1
>      Start End
> [1,]    10  20
> [2,]    21  26
> [3,]    40  70
> [4,]   300 350
>  > series2
>      Start  End
> [1,]    25   40
> [2,]    60  100
> [3,]   210  400
> [4,]   500 1000
>
> I'd like to have something like this as result:
>  > shared
>      Start End
> [1,]    25  26
> [2,]    60  70
> [3,]   300 350
>
> I found this post, but the solution finds the regions shared across
> all the intervals.
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/59594.html
> Can anybody help me with this?
> Thanks
> Giovanni
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


