[BioC] Typo in ?FastqQuality help page?
Martin Morgan
mtmorgan at fhcrc.org
Mon May 24 00:31:45 CEST 2010
On 05/22/2010 06:39 PM, Peng Yu wrote:
> On Sat, May 22, 2010 at 5:44 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>> On 05/22/2010 03:38 PM, Peng Yu wrote:
>>> Hi Martin,
>>>
>>> '?FastqQuality' leads me to a page with the first line 'QualityScore
>>> package:ShortRead R Documentation'
>>>
>>> Then I see,
>>>
>>> Use these functions to construct quality indicators for reads or
>>> alignments. See 'QualityScore' for details of object content and
>>> methods available for manipulating them.
>>> ...
>>> Constructors return objects of the corresponding class derived
>>> from 'QualityScore'.
>>>
>>> ...
>>> 'QualityScore', 'readFastq', 'readAligned'
>>>
>>>
>>> However, when I query the help page of QualityScore, I get nothing.
>>>
>>>> ?QualityScore
>>> No documentation for 'QualityScore' in specified packages and libraries:
>>> you could try '??QualityScore'
>>
>> ?"QualityScore-class"
>
> I don't see where the difference between FastqQuality and
> SFastqQuality is explained. Is the first one for Sanger and the
> second one for Illumina, according to the following webpage?
>
> http://en.wikipedia.org/wiki/FASTQ_format
>
> There are four different Phred score schemes in total. Would you please
> let me know which corresponds to which class in the ShortRead package?
>
> S - Sanger         Phred+33,  raw reads typically (0, 40)
> X - Solexa         Solexa+64, raw reads typically (-5, 40)
> I - Illumina 1.3+  Phred+64,  raw reads typically (0, 40)
> J - Illumina 1.5+  Phred+64,  raw reads typically (3, 40), with
>     0=unused, 1=unused, 2=Read Segment Quality Control Indicator
There are two different types of information here: the quality score
scale (phred vs. solexa) and the character encoding offset (+33 vs. +64).
FastqQuality uses the +33 encoding; SFastqQuality uses the +64 encoding.
The classes are largely silent about the underlying interpretation of
the score as phred versus solexa quality.
Also, ShortRead can arrive at the wrong representation, e.g., when
reading a fastq file that contains quality scores but no indication of
what scale those scores are on or how they are encoded.
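
To make the encoding / scale distinction concrete, here is a minimal
sketch in plain Python. It is not ShortRead code and does not use the
ShortRead API; the function names are hypothetical, chosen only for
illustration. It assumes non-empty quality strings, the +33 and +64
offsets discussed above, and the commonly cited conversion
Q_phred = 10 * log10(10^(Q_solexa / 10) + 1) between the two scales.

```python
import math


def decode_quality(qual, offset):
    """Turn an ASCII quality string into integer scores by subtracting
    the encoding offset (33 or 64). This says nothing about whether the
    resulting numbers are on the phred or the solexa scale."""
    return [ord(ch) - offset for ch in qual]


def solexa_to_phred(q_solexa):
    """Convert one solexa-scale score to the phred scale."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def possible_offsets(qual):
    """List the offsets that are consistent with the characters seen.

    Assumption: +64 data may hold scores as low as -5 (the Solexa range
    quoted above), so its lowest possible character code is 59."""
    lowest = min(ord(ch) for ch in qual)
    candidates = []
    if lowest >= 33:
        candidates.append(33)
    if lowest >= 59:
        candidates.append(64)
    return candidates


# The same scores (40, 40, 20, 40) written with each offset.
print(decode_quality("II5I", 33))  # +33 encoding
print(decode_quality("hhTh", 64))  # +64 encoding

# A +64-looking string is also a legal (if implausible) +33 string,
# which is why the encoding cannot always be determined from the file.
print(possible_offsets("hhTh"))
print(possible_offsets("II5I"))
```

The last two calls illustrate the ambiguity: a string containing only
high character codes is consistent with both offsets, so a reader has
to be told (or has to guess) which encoding applies, and the characters
alone never say whether the scores are phred or solexa.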
Martin
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793