[BioC] summarizeOverlaps vs countOverlaps

swaraj basu projectbasu at gmail.com
Wed Feb 29 19:24:16 CET 2012


Dear All,

I am getting completely different results for the count of mapped reads on
transcript features using the countOverlaps and summarizeOverlaps.

CountOverlaps on using a gappedAlignment and Granges object gives slightly
different results. SummarizeOverlaps gives totally different results for
the modes "Union" and "intersectionNotEmpty".

Can someone please let me know where I am making a mistake. I have attached
a screenshot of the transcripts. All the transcripts belong to the same
gene.

Following steps were performed by me on a small subset of my data

1. I built a gapped alignment object of mapped reads using the command
reads=readBamGappedAlignments("path / to / bam/sorted.bam")

2. From the gapped alignment I built a genomic ranges object
ranges=GRanges(seqnames=rname(reads),
ranges=IRanges(start=start(reads),
end=end(reads)),strand=rep("*",length(reads)))

3. I built a genomic ranges list from a custom genedb object (4
transcripts).
geneE=exonsBy(genedb,'gene')

4. Used the count and summarize overlaps functions
a). count1=countOverlaps(geneE, reads)
b). count2=countOverlaps(geneE, ranges)
c). count3=assays(summarizeOverlaps(geneE, reads,
mode="UNION"))$counts[,1]
c). count4=assays(summarizeOverlaps(geneE, reads,
mode="intersectionNotEmpty"))$counts[,1]

5. I get four different count results
"T1, T2, T3, T4"
a). count1 "55972, 41029, 18270, 18734"
b). count2 "55989, 41046, 18287, 18751"
c). count3 "0, 2, 0, 1092"
d). count4 "0, 3712, 0, 2550"

-- 
Swaraj Basu
PhD Student (Bioinformatics - Functional Genomics)
Animal Physiology and Evolution
Stazione Zoologica Anton Dohrn
Naples
-------------- next part --------------
A non-text attachment was scrubbed...
Name: transcripts.png
Type: image/png
Size: 12062 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20120229/3e4c18b1/attachment.png>


More information about the Bioconductor mailing list