[Bioc-sig-seq] find overlaps compatible with a transcript

Elizabeth Purdom epurdom at stat.berkeley.edu
Mon Sep 13 20:18:23 CEST 2010


  Hello,

I am using a TranscriptDb and trying to find overlaps with transcripts. 
For example, I have a gapped alignment and I want to see which 
transcripts it is compatible with. If txdb is my TranscriptDb and gr is 
my gapped alignment as a GRanges object, I can use findOverlaps to see 
whether my read overlaps in any way with the individual exons of a 
transcript, but not whether it is compatible with the implied transcript. 
For example, if my gapped read overlaps exons 1, 2, and 3 of the 
transcript, it can only be compatible if it overlaps in a particular way 
(it must contain the end of exon 1, the beginning of exon 3, and all of 
exon 2).

Is there a way to check this? This is probably answered somewhere, but I 
can't seem to find it.

Thanks,
Elizabeth

An example:
 > txdb <- loadFeatures(system.file("extdata",
 +     "UCSC_knownGene_sample.sqlite", package="GenomicFeatures"))
 > exByTx <- exonsBy(txdb, "tx")
# this read is compatible
 > grOk <- GRanges(seqnames=c("chr1", "chr1", "chr1"),
 +     ranges=IRanges(c(2000,2476,3084), c(2090,2584,3089)),
 +     strand=rep("*",3))
# this read is not
 > grNotOk <- GRanges(seqnames=c("chr1", "chr1", "chr1"),
 +     ranges=IRanges(c(2000,2500,3084), c(2090,2584,3089)),
 +     strand=rep("*",3))
# both overlap the same set of transcripts, but the second is not
# compatible with either transcript
 > findOverlaps(GRangesList(grOk,grNotOk),exByTx)
An object of class "RangesMatching"
Slot "matchMatrix":
      query subject
[1,]     1       1
[2,]     1       2
[3,]     2       1
[4,]     2       2

Slot "DIM":
[1]   2 135
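
To make the compatibility rule concrete, here is a minimal by-hand sketch in base R. It is not a Bioconductor function: isCompatible is a hypothetical helper, and the exon coordinates are assumed for illustration (chosen to be consistent with the two reads above) rather than read from txdb. It also assumes the exons are sorted and non-overlapping, and it ignores strand.

```r
# Hypothetical exon coordinates for one transcript (assumed for illustration,
# not extracted from txdb); sorted by position and non-overlapping.
exStart <- c(1116, 2476, 3084)
exEnd   <- c(2090, 2584, 4121)

# Hypothetical helper: TRUE if the blocks of a gapped read are compatible
# with the spliced transcript defined by exStart/exEnd.
isCompatible <- function(rdStart, rdEnd, exStart, exEnd) {
  n <- length(rdStart)
  # Each read block must lie entirely within exactly one exon.
  idx <- vapply(seq_len(n), function(i) {
    hit <- which(exStart <= rdStart[i] & rdEnd[i] <= exEnd)
    if (length(hit) == 1L) hit else NA_integer_
  }, integer(1))
  if (anyNA(idx)) return(FALSE)
  # An ungapped read inside a single exon is trivially compatible.
  if (n == 1L) return(TRUE)
  # Consecutive blocks must fall in consecutive exons (no skipped exon).
  if (any(diff(idx) != 1L)) return(FALSE)
  # Every gap must coincide with an intron: a block before a gap ends at its
  # exon's end, and a block after a gap starts at its exon's start.
  all(rdEnd[-n] == exEnd[idx[-n]]) && all(rdStart[-1] == exStart[idx[-1]])
}

okResult    <- isCompatible(c(2000, 2476, 3084), c(2090, 2584, 3089),
                            exStart, exEnd)   # same blocks as grOk
notOkResult <- isCompatible(c(2000, 2500, 3084), c(2090, 2584, 3089),
                            exStart, exEnd)   # same blocks as grNotOk
print(c(okResult, notOkResult))
```

Under these assumed coordinates the first read passes and the second fails, because its middle block starts at 2500 rather than at the start of the second exon. Something equivalent that works directly on exByTx is what I am looking for.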


