[BioC] CIGAR-aware coverage
Hervé Pagès
hpages at fhcrc.org
Fri Feb 28 08:29:38 CET 2014
Hi Chris,
On 02/27/2014 09:42 PM, Michael Lawrence wrote:
> On Thu, Feb 27, 2014 at 4:37 PM, chris warth <cswarth at gmail.com> wrote:
>
>> I would like to gather coverage data for gapped alignments. I am
>> using readGAlignmentsFromBam() to read a GAlignments object,
>> converting to an IRanges object and calling IRanges::coverage() , but
>> that ignores CIGAR strings.
>>
>
> There appears to be a message to BIOC-devel on CIGAR-aware coverage
>> ( https://www.mail-archive.com/bioc-devel@r-project.org/msg01460.html )
>> that suggests using the 'GenomicAlignments' package.
>>
>>
> GenomicAlignments is mostly a reorganization of pre-existing functionality.
> You should be able to call coverage() directly on your GAlignments, or on
> the BamFile itself, with full support for cigar strings. Also, it is
> sufficient to call readGAlignments(), i.e., no need for "FromBam".
>
>
>> As far as I can tell that is only available under 2.14, the
>> development version of bioconductor, which would also require running
>> R-devel.
>>
>> Is there any way to get CIGAR-aware coverage under bioconductor 2.13?
>>
>> Thanks in advance,
>>
>> Chris Warth
>> Fred Hutchinson CRC
>>
>> P.S. Not to distract from the above question, but I would also like
>> CIGAR-aware retrieval of alignments. readGAlignmentsFromBam()
>> retrieves alignments whose gap completely spans the range I am
>> interested in. I haven't found a way to exclude alignments that
>> don't have any actual base alignments overlapping the range I asked
>> for.
You cannot exclude alignments whose gap completely spans the range
you specify in the 'which' arg of the ScanBamParam object you pass
to readGAlignmentsFromBam(). However you can get rid of them right
after readGAlignmentsFromBam() returns with something like:
my_ROI <- GRanges("chr1, IRanges(1964, 2014)) # my region of interest
gal <- readGAlignmentsFromBam(bamfile, ScanBamParam(which=my_ROI))
subsetByOverlaps(gal, my_ROI)
This subsetting will only keep those alignments that have at least 1
aligned nucleotide that falls within 'my_ROI'.
Cheers,
H.
>>
>> ====
>>> sessionInfo()
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] C
>>
>> attached base packages:
>> [1] parallel stats graphics grDevices utils datasets methods
>> [8] base
>>
>> other attached packages:
>> [1] wavelets_0.3-0 vegan_2.0-10 lattice_0.20-24
>> [4] permute_0.8-3 rtracklayer_1.21.14 Rsamtools_1.13.52
>> [7] Biostrings_2.29.19 BiocInstaller_1.12.0 GenomicRanges_1.14.4
>> [10] XVector_0.1.4 IRanges_1.20.6 BiocGenerics_0.7.8
>>
>> loaded via a namespace (and not attached):
>> [1] BSgenome_1.29.1 RCurl_1.95-4.1 XML_3.98-1.1 bitops_1.0-6
>> [5] compiler_3.0.2 grid_3.0.2 stats4_3.0.2 tools_3.0.2
>> [9] zlibbioc_1.7.0
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
More information about the Bioconductor
mailing list