[BioC] countOverlaps() only count positive strand (from GenomicRanges example)

Hervé Pagès hpages at fhcrc.org
Mon Dec 13 22:38:19 CET 2010


Hi Song, Vince, Martin, Jason,

On 12/13/2010 11:22 AM, Jason wrote:
[...]
>
> I just encountered the same problem.
> What I did is this
>
> aln<- grg(readBamGappedAlignments(bam))
> strand(aln) = "*"
> countOverlaps(exonRanges, aln)

FWIW I just added a "strand<-" method for GappedAlignments objects
to the devel version of the GenomicRanges package (1.3.7). From
the (updated) man page for GappedAlignments:

 > subject
GRanges with 1 range and 0 elementMetadata values
     seqnames    ranges strand |
        <Rle> <IRanges>  <Rle> |
[1]     seq1   [1, 36]      + |

 > galn[8:10]
GappedAlignments of length 3
     rname strand cigar qwidth start end width ngap
[1]  seq1      +   35M     35    15  49    35    0
[2]  seq1      -   35M     35    18  52    35    0
[3]  seq1      +   35M     35    22  56    35    0

 > countOverlaps(galn[8:10], subject)
[1] 1 0 1

 > strand(galn) <- "*"

 > galn[8:10]
GappedAlignments of length 3
     rname strand cigar qwidth start end width ngap
[1]  seq1      *   35M     35    15  49    35    0
[2]  seq1      *   35M     35    18  52    35    0
[3]  seq1      *   35M     35    22  56    35    0
 > countOverlaps(galn[8:10], subject)
[1] 1 1 1

Just to clarify: because GappedAlignments objects follow the Samtools
convention to provide CIGARs that are *always* relative to the + strand
(even for alignments that are on the - strand), modifying the strand
of a GappedAlignments object doesn't alter its cigar field (or any other
field).

Cheers,
H.


 > sessionInfo()
R version 2.13.0 Under development (unstable) (2010-11-28 r53696)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
  [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
  [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
  [7] LC_PAPER=en_US.utf8       LC_NAME=C
  [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rsamtools_1.3.11    Biostrings_2.19.2   GenomicRanges_1.3.7
[4] IRanges_1.9.17

loaded via a namespace (and not attached):
[1] Biobase_2.11.6 tools_2.13.0

>
>
> Jason
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioconductor mailing list