[BioC] report a possble bug in shortread
Martin Morgan
mtmorgan at fhcrc.org
Fri Nov 30 05:32:37 CET 2012
Hi -- If I make a plain text file 'test.fq' with just these four lines in it
> id
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
@ id
HHHHHHHHHHHHHHHHHHGHHHHHHDDCDD#?CCBCDDDCHEHHHHHHHHHHHFH7
and then read it in I get FastqQuality
> data.class(quality(readFastq("/tmp/tmp.fq")))
[1] "FastqQuality"
Do you? The algorithm uses a heuristic and looks at the first 1000 reads
alf <- alphabetFrequency(head(quality, 1000),
collapse = TRUE)
if (any(alf) && min(which(alf != 0)) < 59) {
FastqQuality
} else SFastqQuality
so if you have high-quality reads initially it will guess wrong; I'll increase
the number of reads looked at.
Martin
On 11/29/2012 07:48 PM, Wang Peter wrote:
> i read the fastq file and let the function identify the quality score
>
> if(iteration==1)#
> {
> score_sys = data.class(quality(reads));
> }
> if(score_sys =="SFastqQuality")#Phred+64
> {
> do something
> }
> if(score_sys =="FastqQuality")#Phred+33
> {
> do something
> }
>
>
> but the score_sys=SFastqQuality, the data is Phred+33
>
> see the quality line
> HHHHHHHHHHHHHHHHHHGHHHHHHDDCDD#?CCBCDDDCHEHHHHHHHHHHHFH7
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
More information about the Bioconductor
mailing list