[BioC] wishlist for readGappedAlignments
Martin Morgan
mtmorgan at fhcrc.org
Tue Aug 23 22:47:07 CEST 2011
Hi Tengfei --
We've acted on some of your suggestions (e.g., incorporating any field
in GappedAlignments, making readBamGappedAlignments accept a
ScanBamParam() when Rsamtools 1.5.53 becomes available), thanks!
We're really interested in hearing from you and others about specific
use cases for paired-end representations, particularly those operations
that are difficult to do with the current GappedAlignments (e.g., the
distribution of insertion sizes could now be easily extracted as
ScanBamParam(what="isize", flag=scanBamFlag(isProperPair=TRUE,
isPrimaryRead=TRUE))).
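
As a rough sketch of the insert-size example above (not a definitive recipe), the parameter object can be passed to scanBam() directly. The file name is a hypothetical placeholder, and only the isProperPair filter is used here, since the name of the primary-read flag argument appears to differ between Rsamtools versions.

```r
library(Rsamtools)

# Hypothetical path (an assumption); replace with a BAM file of paired-end reads.
bamFile <- "example.bam"

# Keep only properly paired reads and request just the insert size field.
param <- ScanBamParam(what = "isize",
                      flag = scanBamFlag(isProperPair = TRUE))

# scanBam() returns a list with one element per requested range (a single
# element when no 'which' is given); each element is a list of the fields.
isize <- scanBam(bamFile, param = param)[[1]][["isize"]]

# The two mates of a pair report the insert size with opposite signs,
# so summarize the absolute values.
summary(abs(isize))
```

Summarizing absolute values avoids a distribution that is artificially symmetric around zero.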
So please forward any specific use cases!
Martin
On 08/09/2011 11:33 AM, Tengfei Yin wrote:
> Dear all,
>
> I am using GenomicRanges and Rsamtools a lot for my work; they are extremely
> helpful and neat packages for dealing with NGS data. Thanks a lot to the
> people who contribute to all those nice packages in BioC. I have some
> feature requests for GappedAlignments; probably they are already there, or
> it's not good practice to do it in a certain way, so please feel free to let
> me know.
>
> I like features from both scanBam and readBamGappedAlignments, but sometimes
> I need to write my own script to combine information from those two
> functions and make a "general" GRanges to work with. So I am wondering if
> there is a neat way to do this, or a plan to implement
> similar features?
>
> - Including more element metadata with GappedAlignments
>   - There is "which" in readBamGappedAlignments; can I have something
>     like "param" or "what" to get more info from the BAM file and associate
>     it with the gapped reads?
>   - When coercing from GappedAlignments to GRanges, or calling
>     granges() on a GappedAlignments object, only the minimal
>     information is returned; "qwidth", "cigar", and "ngap" are not included
>     as elementMetadata.
> - Including more pairing information for paired-end RNA-seq
>   - So I could know the mate information for certain gapped reads, and
>     either plot them as paired-end reads or do some computation on them.
>   - Setting flags for each entry, so I can filter based on the
>     flags, something like scanBamFlag?
> - grglist to transform the data in different ways
>
> If I could get a general data structure that combines all this information
> and these features, that would be nice. I realize it's hard to
> combine all the information and keep it flexible at the same time;
> e.g., you need to decide how to bind element metadata for paired
> entries (perhaps showing seq1/seq2 to indicate which sequence each belongs
> to?) and how to handle multiple hits.
>
> Right now, I am making my own "giant" GRanges object that includes all the
> information I want, but that's too specific to my work. That's why I am
> wondering if there is any plan to combine those neat features and
> provide a more flexible data structure.
>
> Thanks!
>
> Tengfei
>
>
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793