[BioC] a program to force fastq using pred+33 scoring system
Martin Morgan
mtmorgan at fhcrc.org
Mon May 19 19:23:25 CEST 2014
On 05/19/2014 06:49 AM, Wang Peter wrote:
> i am sorry, i have another question about the coding
> phred64 = rawToChar(as.raw(64:104))
> phred33 = rawToChar(as.raw(33:73))
> qual = chartr(phred64, phred33, quality(quality(reads)))
> how did it deal with quality score from 59 to 63?
I did not include them; expand the phred64 and phred33 strings to reflect the
desired mapping.
Martin
>
>
> On Mon, May 19, 2014 at 9:32 PM, Wang Peter <wng.peter at gmail.com
> <mailto:wng.peter at gmail.com>> wrote:
>
> thank you very much
> you coding is so beautiful, mine is ugly
>
>
> On Mon, May 19, 2014 at 8:03 PM, Martin Morgan <mtmorgan at fhcrc.org
> <mailto:mtmorgan at fhcrc.org>> wrote:
>
> On 05/14/2014 02:56 PM, Wang Peter wrote:
>
> i worte a program to automatically dectect the scoring system in
> fastq file
> and if pred+64, it should be changed into pred+33
>
>
> I would make the relevant alphabets
>
> phred64 = rawToChar(as.raw(64:104))
> phred33 = rawToChar(as.raw(33:73))
>
> translate the character representation of the underling BStringSet
>
> qual = chartr(phred64, phred33, quality(quality(reads)))
>
> and tag the result as phred+33
>
> ShortReadQ(sread(reads), FastqQuality(qual))
>
> To get a numeric representation of the quality encoding, I would
>
> as(quality(reads), "numeric")
>
> or for a rectangular, NA-padded representation
>
> as(quality(reads), "matrix")
>
> Martin
>
>
> please help me make sure it is right
>
> rm(list=ls())
> fastqfile="1.fastq";
> working_dir="d:/working_dir"
> setwd(working_dir)
> library(ShortRead);
> reads <- readFastq(fastqfile);
> seqs <- sread(reads);
> score_sys = data.class(quality(reads));
> cat("the quality score system
> (SFastqQuality=Phred+64,__FastqQuality=Phred+33) is",score_sys,"\n")
> raw_len <- max(width(reads));
>
> qual <- quality(quality(reads));
> myqual_16L <- charToRaw(as.character(unlist(__qual)));
> if(score_sys =="FastqQuality")
> {
> myqual_10L <- strtoi(myqual_16L,16L)-33; # this is for other use
> }
>
> if(score_sys =="SFastqQuality")
> {
> myqual_10L <- strtoi(myqual_16L,16L)-64;
> qual_temp <- PhredQuality (as.integer(myqual_10L));
> qual <- BStringSet(unlist(qual_temp), start= seq(from = 1, to =
> raw_len*(length(reads)-1)+1, by = raw_len), width=raw_len);
>
> }
> reads <- ShortReadQ(sread=seqs, quality=qual ) );
>
>
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
>
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793 <tel:%28206%29%20667-2793>
>
>
>
>
> --
> shan gao
> Room 231(Dr.Fei lab)
> Boyce Thompson Institute for Plant Research
> Cornell University
> Tower Road, Ithaca, NY 14853-1801
> Office phone: 1-607-254-1267 <tel:1-607-254-1267>(day)
> Official email:sg839 at cornell.edu <mailto:email%3Asg839 at cornell.edu>
> Facebook:http://www.facebook.com/profile.php?id=100001986532253
>
>
>
>
> --
> shan gao
> Room 231(Dr.Fei lab)
> Boyce Thompson Institute for Plant Research
> Cornell University
> Tower Road, Ithaca, NY 14853-1801
> Office phone: 1-607-254-1267(day)
> Official email:sg839 at cornell.edu <mailto:email%3Asg839 at cornell.edu>
> Facebook:http://www.facebook.com/profile.php?id=100001986532253
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
More information about the Bioconductor
mailing list