[BioC] BAM files to Genomic Ranges object

Hervé Pagès hpages at fhcrc.org
Thu Jan 10 00:45:33 CET 2013


Dario,

On 01/09/2013 03:00 PM, Dario Strbenac wrote:
>> the first argument is supposed to be the path to the bam file, not to the directory in which the bam file is located.
>
> That's right. Also, note the help page.
>
> path : A character vector of length 1. The path of the BAM file.
>
> Notice there is another method BAM2GRangesList described on the help page for processing multiple BAM file paths into a GRangesList.
>
> Hervé Pagès makes a good point. Although this package isn't for the analysis of RNA-seq data, and the output of Tophat is not relevant, this code was written when everyone was using the original Bowtie to do ChIP-seq alignments, which doesn't perform gapped alignments.

Note that it's not just the gaps (i.e. the N operations in the CIGAR
string) that have an impact on the widths of the genomic ranges, but
also the indels (the I and D operations). And I think that Bowtie,
like probably most aligners (even the early ones), introduces indels
in the alignments.

However, the impact of the indels is typically less than the impact of
the gaps: in the range of 1-10 nucleotides for the former, hundreds of
nucleotides for the latter. But still, depending on the kind of
downstream analysis you're doing, being off by just 1 nucleotide can
have big consequences (e.g. when translating to protein, where a
1-nucleotide shift changes the reading frame).

Not really worth taking that risk when it's easy to get this exactly
right ;-)
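
To make the point about widths concrete, here is a minimal Python sketch
(not Bioconductor code; the function names are made up for this example)
that derives the width of an alignment on the reference from its CIGAR
string. It assumes the operation semantics of the SAM specification: M,
D, N, = and X consume reference positions, while M, I, S, = and X consume
read bases. Using the read length as the range width is only correct when
the two numbers agree.

```python
import re

# CIGAR operations that consume reference positions (SAM specification).
REF_CONSUMING = set("MDN=X")
# CIGAR operations that consume read (query) bases.
QUERY_CONSUMING = set("MIS=X")

_CIGAR_FULL = re.compile(r"(?:\d+[MIDNSHP=X])+")
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if not _CIGAR_FULL.fullmatch(cigar):
        raise ValueError("malformed CIGAR string: %r" % cigar)
    return [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]


def reference_width(cigar):
    """Number of reference positions spanned by the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op in REF_CONSUMING)


def read_length(cigar):
    """Number of read bases described by the CIGAR string."""
    return sum(n for n, op in parse_cigar(cigar) if op in QUERY_CONSUMING)


if __name__ == "__main__":
    # Ungapped, 2-base insertion, 3-base deletion, 500-base gap (N).
    for cigar in ["36M", "20M2I14M", "20M3D16M", "18M500N18M"]:
        print(cigar, read_length(cigar), reference_width(cigar))
```

All four example reads are 36 bases long, but only the ungapped one
spans exactly 36 reference positions: the insertion shortens the range
by 2, the deletion lengthens it by 3, and the gap lengthens it by 500.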

Cheers,
H.


> This will be updated.

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
