[Bioc-sig-seq] locally aligned reads

Hervé Pagès hpages at fhcrc.org
Tue Mar 23 18:35:46 CET 2010


Hi Hans-Ulrich,

An update on this. We've recently added the GenomicRanges package
to BioC devel.

   Description: The ability to efficiently store genomic annotations and
         alignments is playing a central role when it comes to analyze
         high-throughput sequencing data (a.k.a. NGS data). The package
         defines general purpose containers for storing genomic intervals
         as well as more specialized containers for storing alignments
         against a reference genome.

In particular it provides a class (GappedAlignments) that is suitable
for storing alignments with deletions, insertions and/or gaps. Note
that it's a light-weight class in the sense that it doesn't store the
query sequences. Right now the mismatch information isn't stored either
but we might consider adding it in the future based on the feedback we
receive.
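
To make the idea of a gapped alignment without query sequences more
concrete, here is a minimal sketch in Python. It is not code from
GenomicRanges or Rsamtools; the function name cigar_to_ref_ranges and
the choice to split blocks at both deletions (D) and gaps (N) are
assumptions made for illustration. It shows that a start position plus
a CIGAR string is enough to recover the reference ranges a read covers,
which is why the query sequence itself does not need to be stored.

```python
import re

# CIGAR operations that consume reference positions (per the SAM spec).
REF_CONSUMING = set("MDN=X")
# Operations treated here as aligned bases on the reference.
ALIGNED = set("M=X")

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_to_ref_ranges(pos, cigar):
    """Return 1-based, inclusive (start, end) reference ranges for a read.

    pos is the leftmost aligned reference position. Blocks are split at
    D and N (an assumption of this sketch); an insertion (I) or a soft
    clip (S) does not consume reference, so the blocks around an
    insertion are merged into one range.
    """
    ranges = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in ALIGNED and n > 0:
            if ranges and ranges[-1][1] == ref - 1:
                # Adjacent to the previous block: extend it.
                ranges[-1] = (ranges[-1][0], ref + n - 1)
            else:
                ranges.append((ref, ref + n - 1))
        if op in REF_CONSUMING:
            ref += n
    return ranges


if __name__ == "__main__":
    print(cigar_to_ref_ranges(100, "5S10M2I5M3D8M100N20M"))
```

For the example read above, the soft clip and the insertion leave the
reference position untouched, while the deletion and the gap each start
a new range.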

Please give GenomicRanges a try when you get a chance. It's still very
much a work in progress (as the version number, 0.0.9, indicates), but
you can already perform some useful operations like coverage(),
findOverlaps(), etc. on your GappedAlignments/GRanges/GRangesList
objects.
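
For readers unfamiliar with these two operations, here is a rough
Python sketch of what they compute on plain (start, end) ranges. The
function names coverage and find_overlaps below are hypothetical
stand-ins written for this illustration; they are not the GenomicRanges
implementations and ignore strand and sequence names.

```python
def coverage(ranges, length):
    """Per-position depth over positions 1..length.

    ranges holds 1-based, inclusive (start, end) tuples. A difference
    array is used so each range costs O(1) to record.
    """
    diff = [0] * (length + 2)
    for start, end in ranges:
        start, end = max(start, 1), min(end, length)
        if start > end:
            continue  # range lies entirely outside 1..length
        diff[start] += 1
        diff[end + 1] -= 1
    depth, running = [], 0
    for i in range(1, length + 1):
        running += diff[i]
        depth.append(running)
    return depth


def find_overlaps(query, subject):
    """Return (query_index, subject_index) pairs of overlapping ranges.

    Two inclusive ranges overlap when each starts no later than the
    other ends. This is a simple quadratic scan, fine for small inputs.
    """
    hits = []
    for i, (qs, qe) in enumerate(query):
        for j, (ss, se) in enumerate(subject):
            if qs <= se and ss <= qe:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    reads = [(1, 3), (2, 5)]
    print(coverage(reads, 6))
    print(find_overlaps(reads, [(4, 10)]))
```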

Your feedback will be very much appreciated.

Cheers,
H.

PS: At the moment you still need to install Rsamtools directly from
Subversion in order to load a BAM file into a GappedAlignments object
with readGappedAlignments().


Martin Morgan wrote:
> On 02/12/2010 06:28 AM, Patrick Aboyoun wrote:
>> Hans-Ulrich,
>> We at the Hutch are currently adding support for alignments with indels
>> (I/D/S) and gaps (N) to Rsamtools in time for the BioC 2.6 release. What
>> functionality would you consider to be a must have for the BioC 2.6
>> release? We don't have a full spec for the release and your input, and
>> that of others, would be of great value to us.
>>
>>
>> Patrick
>>
>> On 2/12/10 2:38 AM, Hans-Ulrich Klein wrote:
>>> Dear All,
>>>
>>> I am working with reads that have local alignments. That means that
>>> the reads only partially align to the reference. A read may also have
>>> two or more subsequences that align to different positions. The BioC
>>> class AlignedRead is not suitable in this situation. Also the
>>> Rsamtools package does not support reads that have e.g. "D", "I" or
>>> "S" within their cigar strings.
> 
> For what it's worth, Rsamtools scanBam does currently input these reads;
> the cigar is a factor and interpretable by the end user; as Patrick says
> things are in the works for much richer support. Martin
> 
> 
>>> I would like to know whether someone has already implemented or
>>> intends to add some functionality for local alignments.
>>>
>>> Best wishes,
>>> Hans-Ulrich
>>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> 
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
