[Bioc-sig-seq] more operations on BamViews

Vincent Carey stvjc at channing.harvard.edu
Wed Mar 2 18:44:32 CET 2011


On Wed, Mar 2, 2011 at 9:58 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> On 03/01/2011 04:44 AM, Michael Lawrence wrote:
>> Hi guys,
>>
>> What are the plans for the BamViews class? It looks like a useful
>> foundation. One thing that would be good to have in R is a way to calculate
>> "pileups" or base tallies for positions of interest. These counts could be
>> broken down by sample (bamfile), cycle (position in the read), etc. Results
>> returned as a DataFrame (in a format like that returned by as.data.frame on
>> a table) that could be aggregated() up as desired. Rles would save memory.
>> So there could be something like an alphabetFrequency() method for BamViews.
>> This is related to Steve's recent work with counting over XStringSets.
>
> Hi Michael -- BamViews is definitely open for more development. The
> methods currently implemented (minimal!) basically dispatch to
> single-bam variants. And I guess there is no single-bam variant of what
> you're looking for.
>
> Another possibility is to expose more of samtools, e.g., pileup /
> mpileup, which might be returned more or less directly for manipulation
> in R, or summarized. I'll work on this in the 3-week time frame (sorry).

Exposing pileup/mpileup is what occurred to me as well. I hope it is
not premature to raise some concern about the downstream container for
the outputs of these tools. We have a pileup-output parser that delivers
a GRanges, and that is probably adequate, although decoding the pileup
string might be a useful added value.
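
To make "decoding the pileup string" concrete, here is a minimal sketch
of tallying the read-bases column of one samtools pileup line. It is
written in plain Python rather than against any Bioconductor API, the
function name decode_pileup_bases is hypothetical, and it assumes the
classic samtools pileup encoding ('.' and ',' for reference matches,
'^' plus a mapping-quality character at a read start, '$' at a read end,
'+N'/'-N' followed by N bases for indels, '*' for a deleted base). Strand
is deliberately collapsed; a real container would probably keep it.

```python
import re
from collections import Counter

def decode_pileup_bases(ref_base, bases):
    """Tally the read-bases column of one pileup line (strand collapsed)."""
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read start: the next character encodes mapping quality, skip both.
            i += 2
        elif c == "$":
            # Read end marker carries no base.
            i += 1
        elif c in "+-":
            # Indel: sign, decimal length, then that many inserted/deleted bases.
            m = re.match(r"\d+", bases[i + 1:])
            length = int(m.group())
            start = i + 1 + len(m.group())
            counts[c + bases[start:start + length].upper()] += 1
            i = start + length
        elif c in ".,":
            # Match to the reference on the forward (.) or reverse (,) strand.
            counts[ref_base.upper()] += 1
            i += 1
        else:
            # Mismatch base, '*' deletion placeholder, or any other symbol.
            counts[c.upper()] += 1
            i += 1
    return counts

tally = decode_pileup_bases("A", "..,^F.T$,+2AG-1c*")
print(dict(tally))
```

The resulting counts per position could then be stored as extra columns
alongside the ranges, which is one possible answer to the container
question rather than a settled design.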

mpileup delivers VCF/BCF, and while we can scan these files, some of
the structures returned can only be interpreted by checking the relevant
file specification. It would be good to have some downstream data
modeling, based on use cases, that the mpileup interface could target.
Such developments could be important for the ISMB tutorial, so I will be
thinking more about this in the coming weeks.
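
As one illustration of why the specification is needed to interpret
scanned VCF, the INFO column is only typed by the ##INFO declarations in
the file header. The sketch below (plain Python, hypothetical function
names info_schema and parse_info, assuming VCF 4.x style header lines of
the form ##INFO=<ID=...,Number=...,Type=...,Description="...">) uses
those declarations to turn an INFO string into typed values. It is meant
only to show the kind of modeling step involved, not a proposed
interface.

```python
import re

# Matches the ID, Number and Type attributes of a VCF 4.x ##INFO header line.
INFO_DEF = re.compile(r"##INFO=<ID=([^,]+),Number=([^,]+),Type=([^,>]+)")
CASTS = {"Integer": int, "Float": float, "String": str, "Character": str}

def info_schema(header_lines):
    """Map each declared INFO key to its (Number, Type) pair."""
    schema = {}
    for line in header_lines:
        m = INFO_DEF.match(line)
        if m:
            key, number, typ = m.groups()
            schema[key] = (number, typ)
    return schema

def parse_info(info, schema):
    """Convert one INFO column value into a dict of typed values."""
    out = {}
    if info == ".":
        return out  # '.' means no INFO entries for this record
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        # Undeclared keys fall back to a list of strings.
        number, typ = schema.get(key, (".", "String"))
        if typ == "Flag" or not sep:
            out[key] = True
            continue
        cast = CASTS.get(typ, str)
        # '.' inside a value list denotes a missing entry.
        values = [None if v == "." else cast(v) for v in value.split(",")]
        out[key] = values[0] if number == "1" else values
    return out

header = [
    "##fileformat=VCFv4.0",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth">',
    '##INFO=<ID=AF,Number=.,Type=Float,Description="Allele frequency">',
    '##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership">',
]
schema = info_schema(header)
print(parse_info("DP=14;AF=0.5,0.25;DB", schema))
```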

>
> Maybe Herve will weigh in on Steve's XStringSet sliding window
> letterFrequencyAt
>
> Martin
>
>>
>> Surely there are many other features that could be added. The above is just
>> one that I would use often, across a number of contexts.
>>
>> Thanks,
>> Michael
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793


