[Bioc-sig-seq] filtering by adapters in QA report

Martin Morgan mtmorgan at fhcrc.org
Fri Mar 25 16:59:17 CET 2011


On 03/24/2011 10:56 AM, Michael Lawrence wrote:
> Hi Martin,
>
> It would be nice if the ShortRead QA report could somehow filter out the
> adapter contamination before generating the rest of its plots, since those
> plots are pretty meaningless if there are adapters present.
>
> Not sure how to handle this filtering in general. For example, what if someone
> then wants to see plots with only the "high quality" reads after the quality
> plots? It gets complicated. ShortRead has a nice filtering mechanism, but
> this case is harder, since some QA plots would come from one filtering stage
> while others come from a different one.
>
> However, under the assumption that no one would ever want to align an
> adapter, i.e., those reads will not be carried forward, the adapter removal
> could just be special-cased and hard-coded. More customized solutions could
> then leverage the internal ShortRead functions for generating each slot in
> the QA object, building it up incrementally on different subsets. Of course,
> to make sense, that would require a different report template, too.

Hi Michael -- Yes it would be nice to be able to more flexibly control 
how different components of the report are generated, or at least to 
make some smarter choices along the lines you suggest for adapter 
contaminants. It's hard to know how to make this really general, but I 
have come across other situations where I'd like to cherry-pick which 
parts of the QA process I want to perform. I think I need some 
standardization on function signatures for generating each report 
section, tighter description of results from each section (i.e., a 
formal class hierarchy), and then a flexible report composition. It 
seems like quite a big task; I wonder if there are good models out there 
to follow? arrayQualityMetrics?
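
As a rough illustration of the three pieces mentioned above (one shared
function signature per report section, a small class hierarchy for section
results, and a composer that pairs each section with the subset of reads it
should see), here is a minimal sketch. It is written in Python only to stay
language-neutral; none of the names below are ShortRead functions, and the
adapter string is a placeholder chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Read = str  # in this sketch a read is just its base sequence
ReadFilter = Callable[[Read], bool]


@dataclass
class SectionResult:
    """Base class: every report section returns a subclass of this."""
    name: str
    n_reads: int


@dataclass
class AdapterResult(SectionResult):
    n_contaminated: int


@dataclass
class BaseCompositionResult(SectionResult):
    counts: Dict[str, int]


# The shared signature: reads in, SectionResult out.
Section = Callable[[Sequence[Read]], SectionResult]

ADAPTER = "AGATCGGAAG"  # hypothetical adapter sequence, for illustration only


def no_adapter(read: Read) -> bool:
    """Filter that keeps only reads without the adapter sequence."""
    return ADAPTER not in read


def adapter_section(reads: Sequence[Read]) -> AdapterResult:
    """Count reads that contain the adapter sequence."""
    hits = sum(1 for r in reads if ADAPTER in r)
    return AdapterResult("adapter", len(reads), hits)


def base_composition_section(reads: Sequence[Read]) -> BaseCompositionResult:
    """Tally nucleotide counts over all reads."""
    counts: Dict[str, int] = {}
    for r in reads:
        for base in r:
            counts[base] = counts.get(base, 0) + 1
    return BaseCompositionResult("base_composition", len(reads), counts)


def compose_report(
    reads: Sequence[Read],
    plan: Sequence[Tuple[Section, Optional[ReadFilter]]],
) -> List[SectionResult]:
    """Run each section on its own subset; a filter of None means all reads."""
    results = []
    for section, keep in plan:
        subset = list(reads) if keep is None else [r for r in reads if keep(r)]
        results.append(section(subset))
    return results


reads = ["ACGTACGT", "AGATCGGAAGTT", "GGGGCCCC"]
report = compose_report(reads, [
    (adapter_section, None),                 # adapters counted on all reads
    (base_composition_section, no_adapter),  # composition on clean reads only
])
```

With this shape the adapter section sees every read, while later sections
see only the reads that pass the adapter filter, and a different plan gives
a different report without touching the section functions themselves.
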

Martin

>
> Michael


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


