[BioC] countOverlaps() only count positive strand (from GenomicRanges example)
Song Li
songli116 at gmail.com
Mon Dec 13 18:06:22 CET 2010
Hi All,
I was following the example in the "GenomicRanges Use Cases", section
3, "Simple RNA-seq Analysis", by "copy-paste" command from the
document from page 7 to page 8:
What I found out is that countOverlap() only count reads aligned to
positive strand. There are 28 reads map to region GRList[6645], 13 on
positive strand, 15 on negative strand. The "count[6645]" is 13.
Is there a way to count both strand?
Thanks,
Song Li
Here are all the code:
#-----> Start with code from the package description<-------#
> library(Rsamtools)
> testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews")
> aligns <- readBamGappedAlignments(testFile)
> library(GenomicFeatures)
> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene")
> exonRanges <- exonsBy(txdb, "tx")
> length(exonRanges)
> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16),
+ sep = ""), "chrM")
> counts <- countOverlaps(exonRanges, aligns)
#-----> find alignments that map to GRList$6645<-------#
> aligns[705:732]
GappedAlignments of length 28
rname strand cigar qwidth start end width ngap
[1] chrXIII + 36M 36 804443 804478 36 0
[2] chrXIII + 36M 36 804466 804501 36 0
[3] chrXIII - 36M 36 804473 804508 36 0
[4] chrXIII + 36M 36 804476 804511 36 0
[5] chrXIII - 36M 36 804493 804528 36 0
[6] chrXIII + 36M 36 804516 804551 36 0
[7] chrXIII + 36M 36 804562 804597 36 0
[8] chrXIII + 36M 36 804562 804597 36 0
[9] chrXIII - 36M 36 804574 804609 36 0
... ... ... ... ... ... ... ... ...
[20] chrXIII + 36M 36 804764 804799 36 0
[21] chrXIII - 36M 36 804798 804833 36 0
[22] chrXIII + 36M 36 804799 804834 36 0
[23] chrXIII + 36M 36 804799 804834 36 0
[24] chrXIII - 36M 36 804816 804851 36 0
[25] chrXIII - 36M 36 804947 804982 36 0
[26] chrXIII - 36M 36 804953 804988 36 0
[27] chrXIII - 36M 36 804955 804990 36 0
[28] chrXIII - 36M 36 804974 805009 36 0
> exonRanges[6646]
GRangesList of length 1
$6646
GRanges with 1 range and 3 elementMetadata values
seqnames ranges strand | exon_id exon_name exon_rank
<Rle> <IRanges> <Rle> | <integer> <character> <integer>
[1] chrXIII [804455, 805090] + | 7011 NA 1
seqlengths
chrIV chrXV chrVII chrXII chrXVI ... chrVI chrI chrM 2micron
1531919 1091289 1090947 1078175 948062 ... 270148 230208 85779 6318
> counts[6646]
[1] 13
> table(strand(aligns[705:732]))
+ -
13 15
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rsamtools_1.2.1 Biostrings_2.18.0 GenomicFeatures_1.2.3
[4] GenomicRanges_1.2.1 IRanges_1.8.2
loaded via a namespace (and not attached):
[1] Biobase_2.10.0 biomaRt_2.6.0 BSgenome_1.18.1 DBI_0.2-5
[5] RCurl_1.4-3 RSQLite_0.9-3 rtracklayer_1.10.5 tools_2.12.0
[9] XML_3.2-0
Song Li
--
Postdoctoral Associate
Institute for Genome Sciences and Policy
Duke University
More information about the Bioconductor
mailing list