[R] range segment exclusion using range endpoints
William Dunlap
wdunlap at tibco.com
Mon May 14 20:38:05 CEST 2012
To the list of functions I sent, add another that converts a list of intervals
into a Ranges object:
as.Ranges.list <- function (x, ...) {
    stopifnot(nargs() == 1, all(vapply(x, length, 0) == 2))
    # use c() instead of unlist() because c() doesn't mangle POSIXct and Date objects
    x <- unname(do.call(c, x))
    odd <- seq(from = 1, to = length(x), by = 2)
    as.Ranges(bottoms = x[odd], tops = x[odd + 1])
}
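The comment about c() versus unlist() can be checked directly. The following is a small illustration only, not part of the functions sent earlier; the variable names d, u and k are made up for the example.

```r
# Two Date values in a list, standing in for interval endpoints
d <- list(as.Date("2012-05-12"), as.Date("2012-05-14"))

u <- unlist(d)      # drops the class attribute: plain numeric day counts
k <- do.call(c, d)  # dispatches to the Date method of c(), class is kept

class(u)
class(k)
```

The same contrast applies to POSIXct endpoints, which is why as.Ranges.list() flattens its input with do.call(c, x).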
Then stop using get() and assign() all over the place and instead make lists of
related intervals and convert them to Ranges objects:
> x <- as.Ranges(list(x_rng))
> s <- as.Ranges(list(s1_rng, s2_rng, s3_rng, s4_rng, s5_rng))
> x
  bottoms tops
1    -100  100
> s
  bottoms tops
1 -250.50 30.0
2    0.77 10.0
3   25.00 35.0
4   70.00 80.3
5   90.00 95.0
and then compute the difference between the sets x and s (i.e., describe
the points in x but not s as a union of intervals):
> setdiffRanges(x, s)
  bottoms tops
1    35.0   70
2    80.3   90
3    95.0  100
and for a graphical check do
> plot(x, s, setdiffRanges(x, s))
Are those the numbers you want?
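as.Ranges() and setdiffRanges() come from the earlier message and are not repeated here. For readers without that code, the following is a minimal base-R sketch of the same set difference for a single x interval. The name setdiff_interval and its list-of-pairs input are assumptions made for this illustration; it is not the setdiffRanges() implementation, and it assumes every interval has bottom <= top.

```r
# Hypothetical helper (not setdiffRanges): subtract the intervals in the
# list s, each c(bottom, top), from the single interval x = c(bottom, top).
setdiff_interval <- function(x, s) {
    s <- do.call(rbind, s)                 # one row per interval
    s <- s[order(s[, 1]), , drop = FALSE]  # sort by bottom
    bottoms <- numeric(0)
    tops <- numeric(0)
    cursor <- x[1]                         # leftmost point of x not yet handled
    for (i in seq_len(nrow(s))) {
        if (s[i, 1] > cursor) {
            # uncovered gap before this s interval, clipped to the top of x
            bottoms <- c(bottoms, cursor)
            tops <- c(tops, min(s[i, 1], x[2]))
        }
        cursor <- max(cursor, s[i, 2])
        if (cursor >= x[2]) break          # the rest of x is already covered
    }
    if (cursor < x[2]) {
        # whatever remains of x after the last s interval
        bottoms <- c(bottoms, cursor)
        tops <- c(tops, x[2])
    }
    data.frame(bottoms = bottoms, tops = tops)
}

x_rng <- c(-100, 100)
s_rngs <- list(c(-250.5, 30), c(0.77, 10), c(25, 35), c(70, 80.3), c(90, 95))
res <- setdiff_interval(x_rng, s_rngs)
res
```

For the data above this gives the intervals (35, 70), (80.3, 90) and (95, 100), the same numbers that setdiffRanges(x, s) printed. An s interval that starts below x, such as c(-250.5, 30), only moves the cursor forward and never produces a segment outside x.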
I find it easier to use standard functions and data structures for this than
to adapt the cumsum/order idiom to different situations.
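The advice above about replacing get() and assign() with a list can be sketched as follows. The names s_rngs, bottoms and tops and the extra interval s6 are made up for the illustration and do not appear in the thread.

```r
# Keep related intervals in one named list instead of s1_rng, s2_rng, ...
s_rngs <- list(s1 = c(-250.5, 30), s2 = c(0.77, 10), s3 = c(25, 35),
               s4 = c(70, 80.3), s5 = c(90, 95))

# Pull out the endpoints without grep(ls()) or get()
bottoms <- vapply(s_rngs, `[`, numeric(1), 1)
tops <- vapply(s_rngs, `[`, numeric(1), 2)

# Adding an interval is ordinary list assignment, no assign() needed
s_rngs[["s6"]] <- c(98, 120)
length(s_rngs)
```

A list like this can be passed around as a single argument and cannot pick up stray workspace objects whose names happen to match a pattern.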
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Ben quant
> Sent: Monday, May 14, 2012 11:07 AM
> To: jim holtman
> Cc: r-help at r-project.org
> Subject: Re: [R] range segment exclusion using range endpoints
>
> Turns out this solution doesn't work if the s range is outside the range of
> the x range. I didn't include that in my examples, but it is something I
> have to deal with quite often.
>
> For example s1_rng below causes an issue:
>
> x_rng = c(-100,100)
> s1_rng = c(-250.5,30)
> s2_rng = c(0.77,10)
> s3_rng = c(25,35)
> s4_rng = c(70,80.3)
> s5_rng = c(90,95)
>
> sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> for (i in sNames){
> queue <- rbind(queue
> , c(get(i)[1], 1) # enter queue
> , c(get(i)[2], -1) # exit queue
> )
> }
> queue <- queue[order(queue[, 1]), ] # sort
> queue <- cbind(queue, cumsum(queue[, 2])) # of people in the queue
> for (i in which(queue[, 3] == 1)){
> cat("start:", queue[i, 1L], ' end:', queue[i + 1L, 1L], "\n")
> }
>
> Regards,
>
> ben
> On Sat, May 12, 2012 at 12:50 PM, jim holtman <jholtman at gmail.com> wrote:
>
> > Here is an example of how you might do it. It uses a technique of
> > counting how many items are in a queue based on their arrival times;
> > it can be used to also find areas of overlap.
> >
> > Note that it would be best to use a list for the 's' end points
> >
> > ================================
> > > # note the next statement removes names of the format 's[0-9]+_rng'
> > > # it would be best to create a list with the 's' endpoints, but this is
> > > # what the OP specified
> > >
> > > rm(list = grep('s[0-9]+_rng', ls(), value = TRUE)) # Danger Will Robinson!!
> > >
> > > # ex 1
> > > x_rng = c(-100,100)
> > >
> > > s1_rng = c(-25.5,30)
> > > s2_rng = c(0.77,10)
> > > s3_rng = c(25,35)
> > > s4_rng = c(70,80.3)
> > > s5_rng = c(90,95)
> > >
> > > # ex 2
> > > # x_rng = c(-50.5,100)
> > >
> > > # s1_rng = c(-75.3,30)
> > >
> > > # ex 3
> > > # x_rng = c(-75.3,30)
> > >
> > > # s1_rng = c(-50.5,100)
> > >
> > > # ex 4
> > > # x_rng = c(-100,100)
> > >
> > > # s1_rng = c(-105,105)
> > >
> > > # find all the names -- USE A LIST NEXT TIME
> > > sNames <- grep("s[0-9]+_rng", ls(), value = TRUE)
> > >
> > > # initial matrix with the 'x' endpoints
> > > queue <- rbind(c(x_rng[1], 1), c(x_rng[2], 1))
> > >
> > > # add the 's' end points to the list
> > > # this will be used to determine how many things are in a queue
> > > # (or areas that overlap)
> > > for (i in sNames){
> > + queue <- rbind(queue
> > + , c(get(i)[1], 1) # enter queue
> > + , c(get(i)[2], -1) # exit queue
> > + )
> > + }
> > > queue <- queue[order(queue[, 1]), ] # sort
> > > queue <- cbind(queue, cumsum(queue[, 2])) # of people in the queue
> > > print(queue)
> >          [,1] [,2] [,3]
> >  [1,] -100.00    1    1
> >  [2,]  -25.50    1    2
> >  [3,]    0.77    1    3
> >  [4,]   10.00   -1    2
> >  [5,]   25.00    1    3
> >  [6,]   30.00   -1    2
> >  [7,]   35.00   -1    1
> >  [8,]   70.00    1    2
> >  [9,]   80.30   -1    1
> > [10,]   90.00    1    2
> > [11,]   95.00   -1    1
> > [12,]  100.00    1    2
> > >
> > > # print out values where the last column is 1
> > > for (i in which(queue[, 3] == 1)){
> > + cat("start:", queue[i, 1L], ' end:', queue[i + 1L, 1L], "\n")
> > + }
> > start: -100 end: -25.5
> > start: 35 end: 70
> > start: 80.3 end: 90
> > start: 95 end: 100
> > >
> > >
> > =========================================
> >
> > On Sat, May 12, 2012 at 1:54 PM, Ben quant <ccquant at gmail.com> wrote:
> > > Hello,
> > >
> > > I'm posting this again (with some small edits). I didn't get any replies
> > > last time...hoping for some this time. :)
> > >
> > > Currently I'm only coming up with brute force solutions to this issue
> > > (loops). I'm wondering if anyone has a better way to do this. Thank you
> > > for your help in advance!
> > >
> > > The problem: I have endpoints of one x range (x_rng) and an unknown
> > > number of s ranges (s[#]_rng), also defined by their endpoints. I'd like
> > > to remove the parts of the x range that overlap with the s ranges. The
> > > examples below demonstrate what I mean.
> > >
> > > What is the best way to do this?
> > >
> > > Ex 1.
> > > For:
> > > x_rng = c(-100,100)
> > >
> > > s1_rng = c(-25.5,30)
> > > s2_rng = c(0.77,10)
> > > s3_rng = c(25,35)
> > > s4_rng = c(70,80.3)
> > > s5_rng = c(90,95)
> > >
> > > I would get:
> > > -100,-25.5
> > > 35,70
> > > 80.3,90
> > > 95,100
> > >
> > > Ex 2.
> > > For:
> > > x_rng = c(-50.5,100)
> > >
> > > s1_rng = c(-75.3,30)
> > >
> > > I would get:
> > > 30,100
> > >
> > > Ex 3.
> > > For:
> > > x_rng = c(-75.3,30)
> > >
> > > s1_rng = c(-50.5,100)
> > >
> > > I would get:
> > > -75.3,-50.5
> > >
> > > Ex 4.
> > > For:
> > > x_rng = c(-100,100)
> > >
> > > s1_rng = c(-105,105)
> > >
> > > I would get something like:
> > > NA,NA
> > > or...
> > > NA
> > >
> > > Ex 5.
> > > For:
> > > x_rng = c(-100,100)
> > >
> > > s1_rng = c(-100,100)
> > >
> > > I would get something like:
> > > -100,-100
> > > 100,100
> > > or just...
> > > -100
> > > 100
> > >
> > > PS - You may have noticed that in all of the examples I am including the
> > > s range endpoints in the desired results, which I can deal with later in
> > > my program, so it's not a problem... I think leaving in the s range
> > > endpoints simplifies the problem.
> > >
> > > Thanks!
> > > Ben
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> > Jim Holtman
> > Data Munger Guru
> >
> > What is the problem that you are trying to solve?
> > Tell me what you want to do, not how you want to do it.
> >
>