[Bioc-devel] distanceToNearest Self Query
Valerie Obenchain
vobencha at fhcrc.org
Wed Sep 5 18:14:57 CEST 2012
Hi Dario,
Thanks for catching this. Now fixed in GenomicRanges 1.9.61.
It was already the case for nearest,GenomicRanges,GenomicRanges that
self-hits were found when both 'x' and 'subject' were present but not
when only 'x' was provided. The problem in distanceToNearest was that
when 'subject' was missing it was set equal to 'x' before calling
nearest. This resulted in self-hits being found regardless of how the
method was called. distanctToNearst now behaves appropriately,.
gr <- GRanges("1", IRanges(c(1, 10, 1000), width=2), "+")
> gr
GRanges with 3 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] 1 [ 1, 2] +
[2] 1 [ 10, 11] +
[3] 1 [1000, 1001] +
---
seqlengths:
1
NA
When only 'x' is provided no self-hits are found,
> distanceToNearest(gr)
DataFrame with 3 rows and 3 columns
queryHits subjectHits distance
<integer> <integer> <integer>
1 1 2 8
2 2 1 8
3 3 2 989
When both 'x' and 'subject' are provided self-hits are found,
> distanceToNearest(gr, gr)
DataFrame with 3 rows and 3 columns
queryHits subjectHits distance
<integer> <integer> <integer>
1 1 1 0
2 2 2 0
3 3 3 0
Valerie
On 09/04/2012 11:00 PM, Dario Strbenac wrote:
> Hello,
>
> In findOverlaps, there is an argument ignoreSelf, that means overlaps aren't found for the same range when only the query object is provided. I was hoping I could use something similar for the distanceToNearest function in GenomicRanges, with only the query provided. Would this be a worthwhile addition ?
>
> Example :
>
>> g1
> GRanges with 3 ranges and 0 metadata columns:
> seqnames ranges strand
> <Rle> <IRanges> <Rle>
> [1] 1 [ 1, 2] +
> [2] 1 [ 10, 11] +
> [3] 1 [1000, 1010] +
> ---
> seqlengths:
> 1
> NA
>> distanceToNearest(g1, ignoreSelf = TRUE) # Not implemented
> DataFrame with 3 rows and 3 columns
> queryHits subjectHits distance
> <integer> <integer> <integer>
> 1 1 1 0
> 2 2 2 0
> 3 3 3 0
>> findOverlaps(g1, ignoreSelf = TRUE) # Implemented
> Hits of length 0
> queryLength: 3
> subjectLength: 3
>
> --------------------------------------
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050
> Australia
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
More information about the Bioc-devel
mailing list