[BioC] ShortRead - readAligned() with bowtie & qual
Marc Noguera
mnoguera at imppc.org
Wed Jan 20 12:36:33 CET 2010
Dear list,
I am trying to do some quality assessment on Solexa runs using
Bioconductor and ShortRead.
I am using bowtie as the mapper, which yields bowtie-formatted output with
FASTQ-encoded quality scores for each alignment, such as:
> HWUSI-EAS621_91022_1_100_1938_1667 + chr15 53573544 CAGTCTCCCAAAGTACTGGGATAATAGGTGTGAGACTCC DPYWYWYYWWWWPWWYWTVWWYWWWYYWYWXWBBBBBBB 0 34:C>A,36:A>T
> HWUSI-EAS621_91022_1_100_1938_1823 - chr18 34747447 ACCCGGGAGTTGGGCTGCTTAGTGGCTGGACTCTCTTCC BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0 34:T>G
> HWUSI-EAS621_91022_1_100_1938_608 + chr19 35665132 CAGCTGCTCAGGAGGCTGAGGCAGGAGAATCGCTTGAGC DMTTTSRSTUTTTTTUTTTTTTTTTTQSSBBBBBBBBBB 2
> HWUSI-EAS621_91022_1_100_1938_1207 + chr22 30069585 TCTGGGCCGTGGGGAGGCTCCTCCTTGGCTGATGGCGCC DMTUTTRUTPTSTTUUUTSSTTUTBBBBBBBBBBBBBBB 0 35:T>C,37:A>C
> HWUSI-EAS621_91022_1_100_1938_222 - chr20 61020239 GCCTGGGCCTCCCGAAGTGCTGTGGTTACAGGCATGAGC BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 2 25:A>G,34:C>G
> HWUSI-EAS621_91022_1_100_1938_1562 + chr15 84916971 TGGGTTTCACCATGGTGGCCAGGCTGGTCTCAAACTCCT DNUVUWWWWWWWWWWWWWUWWWWWWWVBBBBBBBBBBBB 0
> HWUSI-EAS621_91022_1_100_1938_1290 - chr9 120742911 AGCCCAAGAGAGCCTTCTCCTCGACCATTACCACCAATG BBBBBBBBBBBBBBBBSWRLPWUWRSWUKTWXXWXWWND 0 33:C>A,35:T>C
When I try to read this file with the readAligned() function with:
> aln <- readAligned("/path/", pattern="test.fastq.bwt", type="Bowtie", qualityType="FastqQuality")
I obtain an AlignedRead object, which includes quality data:
> > quality(aln)
> class: SFastqQuality
> quality:
> A BStringSet instance of length 3331015
> width seq
> [1] 35 BBB=B?:AA:@?@>?B@@AA@@A;>@4>>7922=>
> ... ... ...
> [3331015] 33 %%/<<<1;:<<:<<<<995<<<:<::<<<:<<<
However, when I try to use these qualities for plotting, I obtain "NA" values:
> > alignQuality(aln)
> class: NumericQuality
> quality: NA NA ... NA NA (3331015 total)
So I guess there is some kind of problem when transforming the ASCII-encoded
qualities to numerical values. I have also tried reading the input with the
SFastqQuality type, with no success.
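
To make the ASCII-to-numeric transformation mentioned above concrete, here is a minimal base-R sketch of how an encoded quality string is usually decoded. It is an illustration only: `decode_quality` is a hypothetical helper, not a ShortRead function, and the offsets (33 for Sanger-style FASTQ, 64 for Solexa/Illumina-style encodings) are assumptions about the input rather than something confirmed for this data set.

```r
# Hypothetical helper (not part of ShortRead): decode one ASCII-encoded
# quality string into integer scores by subtracting the encoding offset.
decode_quality <- function(qual, offset = 33L) {
  # utf8ToInt() gives the character codes; the offset maps them to scores
  utf8ToInt(qual) - offset
}

# Assumed offset 33 (Sanger-style), applied to a prefix of a string shown by quality(aln)
decode_quality("BBB=B?", offset = 33L)

# Assumed offset 64 (Solexa/Illumina-style), applied to a prefix of the bowtie output
decode_quality("DPYWYW", offset = 64L)
```

Which offset applies depends on the pipeline that produced the reads, so decoding the same string with both offsets and checking which gives a plausible score range is a quick sanity check.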
What am I doing wrong?
Thanks in advance,
Marc
--
-----------------------------------------------------
Marc Noguera i Julian, PhD
Genomics unit / Bioinformatics
Institut de Medicina Preventiva i Personalitzada
del Càncer (IMPPC)
B-10 Office
Carretera de Can Ruti
Camí de les Escoles s/n
08916 Badalona, Barcelona