[BioC] wishlist for readGappedAlignments

Martin Morgan mtmorgan at fhcrc.org
Tue Aug 23 22:47:07 CEST 2011

Hi Tengfei --

We've acted on some of your suggestions (e.g., incorporating any field 
in GappedAlignments, making readBamGappedAlignments accept a 
ScanBamParam() when Rsamtools 1.5.53 becomes available), thanks!

We're really interested in hearing from you and others about specific 
use cases for paired-end representations, particularly those operations 
that are difficult to do with the current GappedAlignments (e.g., the 
distribution of insertion sizes could now be easily extracted as 
ScanBamParam(what="isize", flag=scanBamFlag(isProperPair=TRUE, 

So please forward any specific use cases!
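
For concreteness, here is a minimal sketch of extracting insertion sizes with scanBam(). It is not necessarily the exact call intended above; the file used is an assumption (the small ex1.bam example assumed to ship with Rsamtools), and any indexed BAM of paired-end reads could stand in for it.

```r
library(Rsamtools)

## Assumption: example BAM file shipped with the Rsamtools package
bamfile <- system.file("extdata", "ex1.bam", package = "Rsamtools")

## Request only the 'isize' field, restricted to properly paired reads
param <- ScanBamParam(what = "isize",
                      flag = scanBamFlag(isProperPair = TRUE))

## scanBam() returns a list with one element per range; with no 'which'
## specified there is a single element holding the requested fields
res <- scanBam(bamfile, param = param)
isize <- res[[1]]$isize

## isize is signed and reported once per mate, so each pair contributes
## two values of opposite sign; summarise the absolute values
summary(abs(isize))
```

Note that each proper pair is counted twice here; adding a further flag filter (for example, keeping first mates only) would be one way to avoid that.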


On 08/09/2011 11:33 AM, Tengfei Yin wrote:
> Dear all,
> I use GenomicRanges and Rsamtools a lot in my work; they are extremely
> helpful and neat packages for dealing with NGS data. Thanks a lot to the
> people who contribute to all those nice packages in BioC. I have some
> feature requests for GappedAlignments; perhaps they are already there, or
> it's not good practice to do things this way, so please feel free to let me
> know.
> I like features from both scanBam and readBamGappedAlignments, but sometimes
> I need to write my own script to combine information from those two
> functions and make a "general" GRanges to work with. So I am wondering if
> there is a neat way to do this, or a plan to implement
> similar features?
>     - Including more element metadata with GappedAlignments
>        - there is "which" in readBamGappedAlignments; could I have something
>        like "param" or "what" to get more info from the BAM file and associate
>        it with the gapped reads?
>        - When coercing from GappedAlignments to GRanges, or calling
>        granges() on a GappedAlignments object, only the minimal information
>        is returned; "qwidth", "cigar", and "ngap" are not included as
>        elementMetadata.
>     - Including more pairing information for paired-end RNA-seq
>        - So I could know the mate information for a given gapped read,
>        and either plot it as a paired-end read or do some computation on it.
>        - Setting flags for each entry, so I can filter based on the
>        flags, something like scanBamFlag?
>        - grglist to transform the data in a different way
> If I could get a general data structure which combines all this information
> and these features, that would be nice. I realize it's hard to
> combine all the information and keep it flexible at the same time,
>   e.g. you need to decide how to bind element metadata for a paired
> entry (perhaps showing seq1/seq2 to indicate which sequence it belongs
> to?) and how to handle multiple hits.
> Right now, I am making my own "giant" GRanges object which includes all the
> information I want, but that's too specific to my work. That's why I am
> wondering if there is any plan to combine those neat features and
> provide a more flexible data structure.
> Thanks!
> Tengfei

Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793
