[BioC] Amplicon and exon level read counts and GC content
Martin Morgan
mtmorgan at fhcrc.org
Thu Jun 7 15:08:42 CEST 2012
On 06/06/2012 09:53 PM, Yu Chuan Tai wrote:
> Hi Martin,
>
> More questions on your approaches below. If my BAM files are
> generated by Bowtie2 on paired-end fastq files, should I set
> isPaired=TRUE in scanBamFlag()? Do I need to worry about other
> arguments to scanBamFlag() or ScanBamParam() if I want to
> calculate coverage properly?
It really depends on what you're interested in doing; see for instance
the post by Herve the other day
https://stat.ethz.ch/pipermail/bioconductor/2012-June/046052.html
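
As a minimal sketch of restricting reads by flag: this assumes the
ex1.bam example file shipped with Rsamtools rather than your Bowtie2
output, and an illustrative region on "seq1". Whether isProperPair=TRUE
is the right filter depends on what you want the coverage to mean.

```r
library(Rsamtools)

# Example BAM file shipped with Rsamtools; substitute your own file.
fl <- system.file("extdata", "ex1.bam", package="Rsamtools")

# Keep only mapped reads that are paired and flagged as properly paired.
flag <- scanBamFlag(isPaired=TRUE, isProperPair=TRUE,
                    isUnmappedQuery=FALSE)

# Illustrative region; 'which' restricts the query to reads overlapping it.
param <- ScanBamParam(flag=flag,
                      which=GRanges("seq1", IRanges(100, 500)))

# One row per region, with the number of records and nucleotides retained.
counts <- countBam(fl, param=param)
counts
```

The same 'param' can be passed to scanBam() and the other input
functions, so the filter is applied consistently wherever reads are
imported.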
>
> Also, summarizeOverlaps() doesn't seem to handle paired-end reads.
> How can I get around this, or does it not affect the coverage calculation?
There is better support for paired-end reads in the 'devel' version of
Bioconductor; see
http://bioconductor.org/developers/useDevel/
Whether paired-endedness matters, and which aspects of it, depends on
how you are using your coverage.
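
For illustration only, a sketch of reading the alignments as pairs.
It assumes a current Bioconductor release, in which this functionality
lives in the GenomicAlignments package (an assumption; that package is
not mentioned in this thread), and again uses the ex1.bam example file.

```r
library(GenomicAlignments)

# Example paired-end BAM file shipped with Rsamtools.
fl <- system.file("extdata", "ex1.bam", package="Rsamtools")

# Each element holds the two mates of one pair.
galp <- readGAlignmentPairs(fl)

# Per-base coverage derived from the paired alignments,
# one run-length encoded vector per reference sequence.
cvg <- coverage(galp)
cvg
```

In the same releases summarizeOverlaps() takes a singleEnd argument;
singleEnd=FALSE asks it to count pairs rather than individual reads.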
>
> Finally, is there any way to calculate base-specific coverage at any
> genomic locus or interval in Rsamtools? Thanks!
I tried to answer this in your other post.
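
As a rough sketch of per-base depth over one interval: it assumes the
GenomicAlignments package from a current Bioconductor release, the
ex1.bam example file, and an illustrative interval on "seq1". It is
not necessarily the approach described in the other post.

```r
library(GenomicAlignments)

fl <- system.file("extdata", "ex1.bam", package="Rsamtools")

# Illustrative interval: positions 100..500 of "seq1".
region <- GRanges("seq1", IRanges(100, 500))

# Read only the alignments overlapping the interval.
aln <- readGAlignments(fl, param=ScanBamParam(which=region))

# Coverage over each whole reference sequence, as an RleList.
cvg <- coverage(aln)

# Integer depth at each base of the interval.
per_base <- as.integer(cvg[["seq1"]][100:500])
length(per_base)
```

Only reads overlapping the interval are imported, so depth outside
positions 100..500 is incomplete and should be ignored.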
Martin
>
> Best, Yu Chuan
>
>> More specifically, after
>>
>>   library(Rsamtools)
>>   example(scanBam)  # defines 'fl', a path to a bam file
>>
>> for a _single_ genomic range
>>
>>   param = ScanBamParam(what="seq",
>>                        which=GRanges("seq1", IRanges(100, 500)))
>>   dna = scanBam(fl, param=param)[[1]][["seq"]]
>>   length(dna)  # 365 reads overlap region
>>   alphabetFrequency(dna, collapse=TRUE, baseOnly=TRUE)  # 2838 + 3003 GC
>>
>> though you'd likely want to specify several regions (vector
>> arguments to GRanges) and think about flags (scanBamFlag() and the
>> flag argument to ScanBamParam), read mapping quality, reads
>> overlapping more than one region, etc. (summarizeOverlaps
>> implements several counting strategies, but it is 'easy' to
>> implement arbitrary approaches).
>>
>>>
>>> Martin
>>>
>>>>
>>>> Thanks for any input!
>>>>
>>>> Best, Yu Chuan
>>>>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793