[R] Intersection of two sets of intervals

Thomas Meyer tm35 at cornell.edu
Wed Apr 15 19:36:40 CEST 2009


That does it perfectly -- and it's pretty much the same technique as 
used in the intervals pkg.

-tom

On 4/15/2009 12:43 PM, jim holtman wrote:
> Here is one way to find the overlaps:
> 
>> l1 <- rbind(c(1,3), c(5,10), c(13,24))
>> l2 <- rbind(c(2,4), c(7,14), c(20,30))
>> l1
>      [,1] [,2]
> [1,]    1    3
> [2,]    5   10
> [3,]   13   24
>> l2
>      [,1] [,2]
> [1,]    2    4
> [2,]    7   14
> [3,]   20   30
>> # create matrix for overlaps
>> start <- cbind(c(l1[,1], l2[,1]), 1)
>> end <- cbind(c(l1[,2], l2[, 2]), -1)
>> over <- rbind(start, end)
>> # order
>> over <- over[order(over[,1]),]
>> # get overlap count
>> over <- cbind(over, overlap=cumsum(over[,2]))
>> # create the overlap matrix
>> inter <- cbind(start=over[(over[,2] == 1) & (over[,3] == 2), 1],
> +                end=over[(over[,2] == -1) & (over[, 3] == 1), 1])
>> inter
>      start end
> [1,]     2   3
> [2,]     7  10
> [3,]    13  14
> [4,]    20  24
> 
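
For reuse, the steps above can be wrapped in a function. This is only a minimal sketch: the name intersect_intervals is made up here, and it assumes each argument is a two-column matrix of closed intervals whose rows do not overlap one another within the same set (otherwise the "count == 2" test no longer means "one interval from each set"). Ties are broken explicitly so that a start sorts before an end at the same position.

```r
# Hypothetical helper (not from the thread): wraps the sweep shown above.
# Assumes a and b are two-column matrices of closed intervals [open, close]
# and that the intervals within each set do not overlap one another.
intersect_intervals <- function(a, b) {
  n     <- nrow(a) + nrow(b)
  pos   <- c(a[, 1], b[, 1], a[, 2], b[, 2])
  delta <- rep(c(1, -1), each = n)   # +1 for a start, -1 for an end
  # sort by position; at equal positions put starts before ends, so that
  # intervals which merely touch yield a zero-length overlap
  ord   <- order(pos, -delta)
  pos   <- pos[ord]
  delta <- delta[ord]
  depth <- cumsum(delta)             # number of intervals currently open
  cbind(start = pos[delta ==  1 & depth == 2],
        end   = pos[delta == -1 & depth == 1])
}

l1 <- rbind(c(1, 3), c(5, 10), c(13, 24))
l2 <- rbind(c(2, 4), c(7, 14), c(20, 30))
inter <- intersect_intervals(l1, l2)
```

On l1 and l2 this should reproduce the four rows printed above (2-3, 7-10, 13-14, 20-24).
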
> 
> On Wed, Apr 15, 2009 at 12:06 PM, Thomas Meyer <tm35 at cornell.edu> wrote:
>> Stavros, you are quite correct -- I discovered that the hard way a little
>> while ago when testing my two-line solution. Use of pmin/pmax doesn't handle,
>> for instance, cases where more than one interval in one set is wholly
>> contained by an interval in the other. (I have a mis-posted msg awaiting
>> moderator approval in R-help with a concrete example.)
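
To make that failure mode concrete, here is a small made-up case (not the example from the mis-posted message). The matrices a and b are invented for illustration. Row-by-row pmax/pmin only compares the i-th interval of one set with the i-th of the other, whereas comparing every pair with outer() finds both overlaps. The all-pairs version is a sketch that costs nrow(a) * nrow(b) comparisons.

```r
# Invented data: both intervals of b lie inside the second interval of a.
a <- rbind(c(1, 3), c(5, 10))
b <- rbind(c(6, 7), c(8, 9))

# Row-by-row intersection: compares a[1, ] with b[1, ] and a[2, ] with b[2, ]
pairwise <- cbind(open  = pmax(a[, 1], b[, 1]),
                  close = pmin(a[, 2], b[, 2]))
pairwise <- pairwise[pairwise[, "open"] <= pairwise[, "close"], , drop = FALSE]
# pairwise holds only the interval 8-9; the overlap 6-7 is missed

# All-pairs intersection: compares every interval of a with every one of b
lo   <- outer(a[, 1], b[, 1], pmax)
hi   <- outer(a[, 2], b[, 2], pmin)
keep <- lo <= hi
all_pairs <- cbind(open = lo[keep], close = hi[keep])
all_pairs <- all_pairs[order(all_pairs[, "open"]), , drop = FALSE]
# all_pairs holds both 6-7 and 8-9
```
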
>>
>> Your tip to check out the intervals pkg looks promising. FYI, there the
>> general intersection is computed as the complement of the union of the
>> complements, i.e. A*B = (A'+B')', aka De Morgan's law.
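
The De Morgan identity can be sketched in base R as follows. These helpers (iv_union, iv_complement, iv_intersect) are hypothetical names written for this illustration and are not the intervals package's API. The sketch assumes interval sets are two-column matrices and ignores the open/closed distinction at endpoints, so overlaps of zero length are dropped.

```r
# Hypothetical base-R helpers, NOT the intervals package's API.
# An interval set is a two-column matrix with rows (open, close).

# Union: merge overlapping or touching rows into sorted, disjoint rows
iv_union <- function(x) {
  if (nrow(x) == 0) return(x)
  x     <- x[order(x[, 1]), , drop = FALSE]
  reach <- cummax(x[, 2])                      # furthest end seen so far
  # a new block starts where an interval begins past everything before it
  first <- c(TRUE, x[-1, 1] > reach[-nrow(x)])
  last  <- c(first[-1], TRUE)                  # last row of each block
  cbind(x[first, 1], reach[last])
}

# Complement on the real line: the gaps between the merged intervals
iv_complement <- function(x) {
  u    <- iv_union(x)
  gaps <- cbind(c(-Inf, u[, 2]), c(u[, 1], Inf))
  gaps[gaps[, 1] < gaps[, 2], , drop = FALSE]  # drop empty gaps
}

# A*B = (A' + B')'
iv_intersect <- function(a, b) {
  iv_complement(rbind(iv_complement(a), iv_complement(b)))
}

A <- rbind(c(1, 3), c(5, 10), c(13, 24))
B <- rbind(c(2, 4), c(7, 14), c(20, 30))
res <- iv_intersect(A, B)
```

Unlike a counting sweep, this form does not require the intervals within A or within B to be disjoint, since iv_union merges them first.
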
>>
>> Thanks for the help,
>>
>> -tom
>>
>> On 4/15/2009 11:27 AM, Stavros Macrakis wrote:
>>> I can see how pmax/pmin would easily allow the intersection of
>>> corresponding elements of two sequences of intervals, but I don't see
>>> how they help in intersecting two *sets* of intervals, which was the
>>> original problem statement.
>>>
>>>            -s
>>>
>>>
>>>
>>> On Wed, Apr 15, 2009 at 9:14 AM, ONKELINX, Thierry
>>> <Thierry.ONKELINX at inbo.be> wrote:
>>>> Not off the shelf, but still not complicated:
>>>>
>>>> list1 <- data.frame(open=c(1,5), close=c(2,10))
>>>> list2 <- data.frame(open=c(1.5,3), close=c(2.5,10))
>>>>
>>>> Intersec <- data.frame(Open = pmax(list1$open, list2$open), Close =
>>>> pmin(list1$close, list2$close))
>>>> Intersec[Intersec$Open > Intersec$Close, ] <- NA
>>>> Intersec
>>>>
>>>> HTH,
>>>>
>>>> Thierry
>>>>
>>>> -----Original message-----
>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>>> On behalf of Thomas Meyer
>>>> Sent: Wednesday 15 April 2009 14:59
>>>> To: r-help at r-project.org
>>>> Subject: [R] Intersection of two sets of intervals
>>>>
>>>> Hi,
>>>>
>>>> Algorithm question: I have two sets of "intervals", where an interval is
>>>> an ordered pair [a,b] of two numbers. Is there an efficient way in R to
>>>> generate the intersection of two lists of same?
>>>>
>>>> For concreteness: I'm representing a set of intervals with a data.frame:
>>>>
>>>>  > list1 = as.data.frame(list(open=c(1,5), close=c(2,10)))
>>>>  > list1
>>>>  open close
>>>> 1    1     2
>>>> 2    5    10
>>>>
>>>>  > list2 = as.data.frame(list(open=c(1.5,3), close=c(2.5,10)))
>>>>  > list2
>>>>  open close
>>>> 1  1.5   2.5
>>>> 2  3.0  10.0
>>>>
>>>> How do I get the intersection which would be something like:
>>>>  open close
>>>> 1  1.5   2.0
>>>> 2  5.0  10.0
>>>>
>>>> I wonder if there's some ready-built functionality that might help me
>>>> out. I'm new to R and am still learning to vectorize my code and my
>>>> thinking. Or maybe there's a package for interval arithmetic that I can
>>>> just pull off the shelf.
>>>>
>>>> Thanks,
>>>>
>>>> -tom
>>>>
>>>> --
>>>> Thomas Meyer
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>