[BioC] Amplicon and exon level read counts and GC content

Yu Chuan Tai yuchuan at stat.berkeley.edu
Sat Jun 9 04:18:00 CEST 2012


Hi Martin,

A quick question. Does it matter if I don't have the strand info. for the 
interval that I am interested in, when I specify the input arguments for 
ScanBamParam()?

Best,
Yu Chuan

On Thu, 7 Jun 2012, Martin Morgan wrote:

> On 06/06/2012 09:53 PM, Yu Chuan Tai wrote:
>> Hi Martin,
>> 
>> More questions on your approaches below. If my BAM files are
>> generated by Bowtie2 on pair-end fastq files, for scanBamFlag(),
>> should I set isPaired=TRUE? Do I need to worry about other input
>> arguments for scanBamFlag() or ScanBamParam(), if I want to
>> calculate coverage properly?
>
> It really depends on what you're interested in doing; see for instance the 
> post by Herve the other day
>
>  https://stat.ethz.ch/pipermail/bioconductor/2012-June/046052.html
>
>> 
>> Also, summarizeOverlaps() doesn't seem to handle paired-end reads.
>> How to get around this, or it won't affect coverage calculation?
>
> There is better support for paired-end reads in the 'devel' version of
> Biocondcutor; see
>
>  http://bioconductor.org/developers/useDevel/
>
> whether and what aspects of paired-endedness are important depends on how you 
> are using your coverage.
>
>> 
>> Finally, is there any way to calculate base-specific coverage at any
>> genomic locus or interval in Rsamtools? Thanks!
>
> I tried to answer this in your other post.
>
> Martin
>
>> 
>> Best, Yu Chuan
>> 
>>> More specifically, after
>>> 
>>> library(Rsamtools) example(scanBam) # defines 'fl', a path to a
>>> bam file
>>> 
>>> for a _single_ genomic range
>>> 
>>> param = ScanBamParam(what="seq", which=GRanges("seq1",
>>> IRanges(100, 500))) dna = scanBam(fl, param=param)[[1]][["seq"]]
>>> length(dna) # 365 reads overlap region alphabetFrequency(dna,
>>> collapse=TRUE, baseOnly=TRUE) # 2838 + 3003 GC
>>> 
>>> though you'd likely want to specify several regions (vector
>>> arguments to GRanges) and think about flags (scanBamFlag() and the
>>> flag argument to ScanBamParam), read mapping quality, reads
>>> overlapping more than one region, etc. (summarizeOverlaps
>>> implements several counting strategies, but it is 'easy' to
>>> implement arbitrary approaches).
>>> 
>>>> 
>>>> Martin
>>>> 
>>>>> 
>>>>> Thanks for any input!
>>>>> 
>>>>> Best, Yu Chuan
>>>>> 
>>>>> _______________________________________________ Bioconductor
>>>>> mailing list Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the
>>>>> archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>>>> 
>>>>> 
> -- 
>>> Computational Biology Fred Hutchinson Cancer Research Center 1100
>>> Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>> 
>>> Location: M1-B861 Telephone: 206 667-2793
>>> 
>
>
> -- 
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
>



More information about the Bioconductor mailing list