[BioC] about the quality score
Martin Morgan
mtmorgan at fhcrc.org
Thu Jan 12 14:59:30 CET 2012
On 01/11/2012 01:05 PM, wang peter wrote:
> dear martin:
> the Illumina1.3+(Phred+64) is not Solexa score,
>
> YOU CAN SEE :
>
>
> Score Offset phred ASCII
>
> Sanger 33 0–93 33–126
> Solexa 64 -5–62 59–126
> Illumina1.3+ 64 0–62 64–126
>
>
> if i use solexa function to deal with Illumina1.3+, is it compatible?
In ShortRead, FastqQuality and SFastqQuality determine the _encoding_;
SFastqQuality is appropriate for both Solexa and Illumina1.3+. Functions in
ShortRead, e.g., alphabetScore() or as(quality(), "matrix"), operate on
the integer value of the corresponding letter. ShortRead does not
(unless I am missing some code) translate the encoding into probabilities.
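
As a rough illustration of what "the integer value of the corresponding letter" amounts to, here is a minimal sketch in plain Python. It is not ShortRead code: decode_quality is a hypothetical helper, and the offsets (33 and 64) are taken from the table quoted above.

```python
def decode_quality(quality, offset):
    """Subtract the encoding offset from each character's ASCII code.

    Hypothetical helper for illustration only; not part of ShortRead.
    """
    return [ord(ch) - offset for ch in quality]


# Offset 64 covers both Solexa and Illumina1.3+ (they share the encoding).
print(decode_quality("hhB", 64))

# Offset 33 is the Sanger encoding.
print(decode_quality("II#", 33))
```

Both calls yield the same integers, which is the point: the encoding only fixes how letters map to integers, not how those integers relate to error probabilities.
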
Biostrings PhredQuality and SolexaQuality also represent encodings, but
allow coercion to numeric, as(<...>, "numeric"). These coercions use
-10 * log10(p) for PhredQuality and -10 * log10(p / (1 - p)) for
SolexaQuality, where p is the error probability. The latter is not
appropriate for Illumina1.3+ (although the differences are most
pronounced when p is large, i.e., when reads have low quality anyway).
I will add an additional class, IlluminaQuality, to Biostrings in the
'devel' branch.
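
To see how far the two conventions diverge, the two formulas can be evaluated side by side. This is a minimal sketch in plain Python, not the Biostrings implementation; it assumes p is the error probability and reads the Solexa formula as the log-odds, p / (1 - p). The function names are made up for the example.

```python
import math


def phred_score(p):
    """Phred convention: -10 * log10(p)."""
    return -10 * math.log10(p)


def solexa_score(p):
    """Solexa convention: -10 * log10(p / (1 - p)), i.e. log-odds of error."""
    return -10 * math.log10(p / (1 - p))


# Compare the two scores from high quality (small p) to low quality (large p).
for p in (0.001, 0.01, 0.1, 0.5):
    gap = phred_score(p) - solexa_score(p)
    print(f"p={p:<6} phred={phred_score(p):6.2f} "
          f"solexa={solexa_score(p):6.2f} gap={gap:5.2f}")
```

At p = 0.001 the two scores differ by less than 0.01, while at p = 0.5 the Phred score is about 3 and the Solexa score is 0, so the mismatch matters mainly for low-quality bases.
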
Martin
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793