[BioC] CIGAR-aware coverage
Hervé Pagès
hpages at fhcrc.org
Fri Feb 28 08:32:25 CET 2014
On 02/27/2014 11:29 PM, Hervé Pagès wrote:
> Hi Chris,
>
> On 02/27/2014 09:42 PM, Michael Lawrence wrote:
>> On Thu, Feb 27, 2014 at 4:37 PM, chris warth <cswarth at gmail.com> wrote:
>>
>>> I would like to gather coverage data for gapped alignments. I am
>>> using readGAlignmentsFromBam() to read a GAlignments object,
>>> converting to an IRanges object and calling IRanges::coverage() , but
>>> that ignores CIGAR strings.
>>>
>>
>> There appears to be a message to BIOC-devel on CIGAR-aware coverage
>>> ( https://www.mail-archive.com/bioc-devel@r-project.org/msg01460.html )
>>> that suggests using the 'GenomicAlignments' package.
>>>
>>>
>> GenomicAlignments is mostly a reorganization of pre-existing
>> functionality.
>> You should be able to call coverage() directly on your GAlignments, or on
>> the BamFile itself, with full support for cigar strings. Also, it is
>> sufficient to call readGAlignments(), i.e., no need for "FromBam".
>>
>>
>>> As far as I can tell that is only available under 2.14, the
>>> development version of bioconductor, which would also require running
>>> R-devel.
>>>
>>> Is there any way to get CIGAR-aware coverage under bioconductor 2.13?
>>>
>>> Thanks in advance,
>>>
>>> Chris Warth
>>> Fred Hutchinson CRC
>>>
>>> P.S. Not to distract from the above question, but I would also like
>>> CIGAR-aware retrieval of alignments. readGAlignmentsFromBam()
>>> retrieves alignments whose gap completely spans the range I am
>>> interested in. I haven't found a way to exclude alignments that
>>> don't have any actual base alignments overlapping the range I asked
>>> for.
>
> You cannot exclude alignments whose gap completely spans the range
> you specify in the 'which' arg of the ScanBamParam object you pass
> to readGAlignmentsFromBam(). However you can get rid of them right
> after readGAlignmentsFromBam() returns with something like:
>
> my_ROI <- GRanges("chr1, IRanges(1964, 2014)) # my region of interest
> gal <- readGAlignmentsFromBam(bamfile, ScanBamParam(which=my_ROI))
> subsetByOverlaps(gal, my_ROI)
Trying this again:
my_ROI <- GRanges("chr1", IRanges(1964, 2014)) # my region of interest
gal <- readGAlignmentsFromBam(bamfile, param=ScanBamParam(which=my_ROI))
subsetByOverlaps(gal, my_ROI)
Hope that works for you.
H.
>
> This subsetting will only keep those alignments that have at least 1
> aligned nucleotide that falls within 'my_ROI'.
>
> Cheers,
> H.
>
>>>
>>> ====
>>>> sessionInfo()
>>> R version 3.0.2 (2013-09-25)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] C
>>>
>>> attached base packages:
>>> [1] parallel stats graphics grDevices utils datasets methods
>>> [8] base
>>>
>>> other attached packages:
>>> [1] wavelets_0.3-0 vegan_2.0-10 lattice_0.20-24
>>> [4] permute_0.8-3 rtracklayer_1.21.14 Rsamtools_1.13.52
>>> [7] Biostrings_2.29.19 BiocInstaller_1.12.0 GenomicRanges_1.14.4
>>> [10] XVector_0.1.4 IRanges_1.20.6 BiocGenerics_0.7.8
>>>
>>> loaded via a namespace (and not attached):
>>> [1] BSgenome_1.29.1 RCurl_1.95-4.1 XML_3.98-1.1 bitops_1.0-6
>>> [5] compiler_3.0.2 grid_3.0.2 stats4_3.0.2 tools_3.0.2
>>> [9] zlibbioc_1.7.0
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
More information about the Bioconductor
mailing list