[BioC] Is a number within a set of ranges?
Herve Pages
hpages at fhcrc.org
Mon Oct 29 21:33:58 CET 2007
Christos Hatzis wrote:
>> pos <- matrix(c(1, 5, 13, 3, 9, 15), ncol=2)
>> pos
>      [,1] [,2]
> [1,]    1    3
> [2,]    5    9
> [3,]   13   15
>> gene.pos <- c(14,4,10,6)
>> gene.pos
> [1] 14 4 10 6
>
>> within <- sapply(gene.pos, function(g) any(apply(pos, 1, function(x)
> findInterval(g, x)) == 1))
>
>> gene.pos[within]
> [1] 14 6
Good to know the existence of findInterval(). Thanks!
For this particular case though, I would be tempted to keep things simple
by replacing this
  any(apply(pos, 1, function(x) findInterval(g, x)) == 1)
with
  any(apply(pos, 1, function(x) x[1] <= g && g <= x[2]))
Not only is the latter easier to understand, but with the former you'll get
wrong results if one of your genes is positioned at one of the Stop positions:
gene.pos <- c(14,4,10,6,15) # last gene is at a Stop position
# using findInterval() gives:
> within
[1] TRUE FALSE FALSE TRUE FALSE
# using 'x[1] <= g && g <= x[2]' gives:
> within
[1] TRUE FALSE FALSE TRUE TRUE
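For anyone who wants to reproduce the comparison above, here is a minimal
self-contained sketch. The helper names in.any.range.fi and in.any.range.cmp
are made up for this illustration; pos and gene.pos are the ones used in the
thread.

```r
# Ranges (Start, Stop) and gene positions from the thread; the last gene
# sits exactly on a Stop position.
pos <- matrix(c(1, 5, 13, 3, 9, 15), ncol=2)
gene.pos <- c(14, 4, 10, 6, 15)

# Hypothetical helper: findInterval() with its default settings.
in.any.range.fi <- function(g, pos)
    any(apply(pos, 1, function(x) findInterval(g, x)) == 1)

# Hypothetical helper: explicit comparison, both ends inclusive.
in.any.range.cmp <- function(g, pos)
    any(apply(pos, 1, function(x) x[1] <= g && g <= x[2]))

within.fi  <- sapply(gene.pos, in.any.range.fi,  pos=pos)
within.cmp <- sapply(gene.pos, in.any.range.cmp, pos=pos)

gene.pos[within.fi]   # misses the gene at position 15
gene.pos[within.cmp]  # includes the gene at position 15
```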
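Note that the findInterval() approach can be fixed by specifying
'rightmost.closed=TRUE' (by default each interval is closed on the left
and open on the right, so a gene sitting exactly on a Stop position falls
outside of it), but this doesn't make the code easier to understand,
quite the contrary...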
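A rough sketch of what that fix could look like, assuming the same pos
matrix and gene.pos vector as above (the variable name within.rc is just
for illustration):

```r
pos <- matrix(c(1, 5, 13, 3, 9, 15), ncol=2)
gene.pos <- c(14, 4, 10, 6, 15)

# Default behaviour: a value equal to the right end point is counted as
# lying past the interval.
default.hit <- findInterval(15, c(13, 15))
# With rightmost.closed=TRUE the right end point belongs to the interval.
closed.hit <- findInterval(15, c(13, 15), rightmost.closed=TRUE)

# Same test as in the thread, with the extra argument passed through.
within.rc <- sapply(gene.pos, function(g)
    any(apply(pos, 1, function(x)
        findInterval(g, x, rightmost.closed=TRUE)) == 1))

gene.pos[within.rc]
```

Values beyond a Stop position still fall outside the interval, so only
the boundary case changes.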
Cheers,
H.
>
> Look at ?findInterval, which does all the work. It returns 1 if within
> range in this case.
>
> -Christos
>
>> -----Original Message-----
>> From: bioconductor-bounces at stat.math.ethz.ch
>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of
>> Daniel Brewer
>> Sent: Monday, October 29, 2007 12:29 PM
>> To: bioconductor at stat.math.ethz.ch
>> Subject: [BioC] Is a number within a set of ranges?
>>
>> I have a table with a start and stop column which defines a
>> set of ranges. I have another table with a list of genes
>> with associated position. What I would like to do is subset
>> the gene table so it only contains genes whose position is
>> within any of the ranges. What is the best way to do this?
>> The only way I can think of is to construct a long list of
>> conditions linked by ORs but I am sure there must be a better way.
>>
>> Simple example:
>>
>> Start Stop
>> 1 3
>> 5 9
>> 13 15
>>
>> Gene Position
>> 1 14
>> 2 4
>> 3 10
>> 4 6
>>
>> I would like to get out:
>> Gene Position
>> 1 14
>> 4 6
>>
>> Any ideas?
>>
>> Thanks
>>
>> Dan
>>
>> --
>> **************************************************************
>> Daniel Brewer, Ph.D.
>> Institute of Cancer Research
>> Email: daniel.brewer at icr.ac.uk
>> **************************************************************
>>
>