[BioC] countOverlaps() only count positive strand (from GenomicRanges example)

Song Li songli116 at gmail.com
Mon Dec 13 18:06:22 CET 2010


Hi All,

I was following the example in the "GenomicRanges Use Cases", section
3, "Simple RNA-seq Analysis", by "copy-paste" command from the
document from page 7 to page 8:

What I found out is that countOverlap() only count reads aligned to
positive strand.  There are 28 reads map to region GRList[6645], 13 on
positive strand, 15 on negative strand.  The "count[6645]" is 13.

Is there a way to count both strand?

Thanks,
Song Li

Here are all the code:
#-----> Start with code from the package description<-------#
> library(Rsamtools)
> testFile <- system.file("bam", "isowt5_13e.bam", package = "leeBamViews")
> aligns <- readBamGappedAlignments(testFile)
> library(GenomicFeatures)
> txdb <- makeTranscriptDbFromUCSC(genome = "sacCer2", tablename = "sgdGene")
> exonRanges <- exonsBy(txdb, "tx")
> length(exonRanges)
> levels(rname(aligns)) <- c(paste("chr", as.roman(1:16),
+     sep = ""), "chrM")
> counts <- countOverlaps(exonRanges, aligns)

#-----> find alignments that map to GRList$6645<-------#
> aligns[705:732]
GappedAlignments of length 28
       rname strand cigar qwidth  start    end width ngap
[1]  chrXIII      +   36M     36 804443 804478    36    0
[2]  chrXIII      +   36M     36 804466 804501    36    0
[3]  chrXIII      -   36M     36 804473 804508    36    0
[4]  chrXIII      +   36M     36 804476 804511    36    0
[5]  chrXIII      -   36M     36 804493 804528    36    0
[6]  chrXIII      +   36M     36 804516 804551    36    0
[7]  chrXIII      +   36M     36 804562 804597    36    0
[8]  chrXIII      +   36M     36 804562 804597    36    0
[9]  chrXIII      -   36M     36 804574 804609    36    0
...      ...    ...   ...    ...    ...    ...   ...  ...
[20] chrXIII      +   36M     36 804764 804799    36    0
[21] chrXIII      -   36M     36 804798 804833    36    0
[22] chrXIII      +   36M     36 804799 804834    36    0
[23] chrXIII      +   36M     36 804799 804834    36    0
[24] chrXIII      -   36M     36 804816 804851    36    0
[25] chrXIII      -   36M     36 804947 804982    36    0
[26] chrXIII      -   36M     36 804953 804988    36    0
[27] chrXIII      -   36M     36 804955 804990    36    0
[28] chrXIII      -   36M     36 804974 805009    36    0
> exonRanges[6646]
GRangesList of length 1
$6646
GRanges with 1 range and 3 elementMetadata values
    seqnames           ranges strand |   exon_id   exon_name exon_rank
       <Rle>        <IRanges>  <Rle> | <integer> <character> <integer>
[1]  chrXIII [804455, 805090]      + |      7011          NA         1


seqlengths
   chrIV   chrXV  chrVII  chrXII  chrXVI ...   chrVI    chrI    chrM 2micron
 1531919 1091289 1090947 1078175  948062 ...  270148  230208   85779    6318
> counts[6646]
[1] 13
> table(strand(aligns[705:732]))

 +  -
13 15


> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rsamtools_1.2.1       Biostrings_2.18.0     GenomicFeatures_1.2.3
[4] GenomicRanges_1.2.1   IRanges_1.8.2

loaded via a namespace (and not attached):
[1] Biobase_2.10.0     biomaRt_2.6.0      BSgenome_1.18.1    DBI_0.2-5
[5] RCurl_1.4-3        RSQLite_0.9-3      rtracklayer_1.10.5 tools_2.12.0
[9] XML_3.2-0

Song Li
-- 
Postdoctoral Associate
Institute for Genome Sciences and Policy
Duke University



More information about the Bioconductor mailing list