[Bioc-sig-seq] Quality Value Analysis from a BStringSet

Pratap, Abhishek APratap at som.umaryland.edu
Thu Jun 3 21:39:32 CEST 2010


Hi All

I would like to extract the last 5 quality values of each read in a FASTQ file and count them. I have read the file using "readFastq" and stored the quality values as a BStringSet.

E.g.:
A BStringSet instance of length 5119916
          width seq
      [1]    75 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
      [2]    75 bbbbbbbbbbbbabbbbbb`bbbbbbab`b_...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
      [3]    75 aaaaaaa_aaaaO`aa^aaa_a_T_``^[`S...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
      [4]    75 bbbbbbbbbbbbaabbbb`bbb_Uaa___BB...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
      [5]    75 ``a`aa`aaYaTaaaBBBBBBBBBBBBBBBB...BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB

What I would like to do is subseq the last 5 quality values of each read and count how many of them are 'B'. We suspect that, despite a good average quality, we still have a high number of bad bases at the ends of the reads.
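
To make this concrete, here is a rough sketch of the count I have in mind. It runs on a small made-up BStringSet (toy_quals, hypothetical strings, not my real data). It assumes that every read has at least 5 bases, that subseq() accepts per-read start/end vectors, and that counting via as.character() and gsub() is acceptable; there may well be a faster built-in way.

```r
library(Biostrings)

# Toy stand-in for the real quality BStringSet (hypothetical strings).
toy_quals <- BStringSet(c("bbbbbbbbBBBBB",
                          "aaaaaaaaaab`a",
                          "bbbbbbbbbbBBB"))

n_tail <- 5L

# Last n_tail quality characters of every read
# (assumes each read has at least n_tail bases).
tail_quals <- subseq(toy_quals,
                     start = width(toy_quals) - n_tail + 1L,
                     end   = width(toy_quals))

# Count 'B' per read: string length minus the length after deleting every 'B'.
tail_chr <- as.character(tail_quals)
n_B <- nchar(tail_chr) - nchar(gsub("B", "", tail_chr, fixed = TRUE))

# Distribution of per-read 'B' counts in the tail.
table(n_B)

# Fraction of reads whose whole tail is 'B'.
mean(n_B == n_tail)
```

On the real data, toy_quals would be replaced by the BStringSet shown above.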

Any other ideas welcome.

Thanks!
-Abhi


