[BioC] stranded findOverlaps

Michael.Dondrup at uni.no
Fri Jan 22 23:47:20 CET 2010


Hi Robert,
just a quick guess; somebody who knows IRanges better may correct me.
I believe it's not possible to read the strand directly from the
IRanges objects, because start <= end always holds for a non-empty
range, so the coordinates alone cannot encode a direction.
Thus, the strand has to be taken care of when the IRanges for the
RangedData are constructed; that may be the reason why there is no
parameter for in-strand overlap.

The following approach might be simple enough, though:
- split each data set of ranges (alignments, genes, sequencing reads)
into two groups by strand (I assume you have this info somewhere)
- construct two IRanges objects per set (i.e. query and reference), one
for plus and one for minus
- make one IRangesList per set, add the corresponding IRanges objects,
and name them "plus" and "minus" in the list
- compute the overlap of the two IRangesLists (e.g. findOverlaps(set1, set2))
  -> you'll get only the in-strand overlaps
If you have several chromosomes, construct two IRanges objects per
chromosome in each set (e.g. named "chr1.plus" and "chr1.minus").
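
A rough sketch of the idea, not tested against the IRanges version in the session info below: it assumes a recent IRanges release in which findOverlaps() returns a Hits object with queryHits() and subjectHits() accessors. The toy coordinates and the helper name strandedOverlaps are made up for illustration. Instead of building an IRangesList, it searches each strand separately and maps the hits back to the row indices of the original, unsplit data, so the indices still line up with the input.

```r
# Sketch only: assumes a recent IRanges release (Hits objects with
# queryHits()/subjectHits()); older releases returned a different class.
library(IRanges)

# Hypothetical toy data: coordinates plus a separate strand vector per set.
q_start  <- c(1, 10, 20)
q_end    <- c(5, 15, 25)
q_strand <- c("+", "-", "+")

s_start  <- c(3, 12, 22)
s_end    <- c(8, 18, 30)
s_strand <- c("+", "+", "-")

# Hypothetical helper: run findOverlaps() once per strand and translate the
# per-strand hit indices back to indices into the original vectors.
strandedOverlaps <- function(q_start, q_end, q_strand,
                             s_start, s_end, s_strand) {
  hits <- lapply(c("+", "-"), function(st) {
    qi <- which(q_strand == st)  # original query indices on this strand
    si <- which(s_strand == st)  # original subject indices on this strand
    ov <- findOverlaps(IRanges(q_start[qi], q_end[qi]),
                       IRanges(s_start[si], s_end[si]))
    data.frame(query   = qi[queryHits(ov)],
               subject = si[subjectHits(ov)],
               strand  = rep(st, length(ov)))
  })
  do.call(rbind, hits)
}

res <- strandedOverlaps(q_start, q_end, q_strand,
                        s_start, s_end, s_strand)
print(res)
```

With this toy data only query 1 and subject 1 are reported. Query 2 vs. subject 2 and query 3 vs. subject 3 overlap by position but lie on opposite strands, so they are dropped. With chromosomes, the same loop could run over each chromosome/strand combination.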

Does this make sense?
Michael


Quoting Robert Castelo <robert.castelo at upf.edu>:

> dear list, and particularly, the IRanges developers,
>
> i'm using the function findOverlaps from the IRanges package because i
> need to find what stranded genomic intervals from one set (as a
> RangedData object) overlap with what stranded genomic intervals from
> another set (as another RangedData object). the problem is that i don't
> want to consider overlaps between genomic intervals from different
> strands.
>
> i've been looking at the help page of findOverlaps (devel version, see
> my sessionInfo() below) and searched through the BioC mailinglist and my
> preliminary conclusion is that such an operation is not yet supported.
>
> i've been thinking of using rdapply to break down the RangedData objects
> into spaces and then again by the two strands but the problem is that
> the query and subject indexes resulting from findOverlaps will not match
> the dimension of the original RangedData objects.
>
> so, i'd like to suggest that some option is added to this useful
> function to restrict the overlapping search by strand. of course, if
> this is somehow already implemented and i just missed it, then i'll be
> very grateful if you let me know what function/parameter i should be
> using.
>
>
> thanks a lot!!
> robert.
>
> sessionInfo()
> R version 2.11.0 Under development (unstable) (2009-10-06 r49948)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] IRanges_1.5.16


