[BioC] ShortRead - readAligned() with bowtie & qual
Kasper Daniel Hansen
khansen at stat.berkeley.edu
Wed Jan 20 14:38:18 CET 2010
alignQuality is not the same as quality.
quality is the per-base quality of the reads (which is what you are interested in). alignQuality is the quality of the _alignment_, which Bowtie does not report (one could say that a perfect-match alignment is better than a 1-mismatch alignment, and so on), hence the NA values. You should also have noticed that alignQuality is a numeric vector with only one element per read, whereas the read qualities have one element per read per base.
So you need to operate on quality(aln).
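As a rough illustration of the per-base versus per-read distinction, here is a minimal base-R sketch that decodes two of the quality strings quoted below into integer scores. The ASCII offset of 64 is an assumption (Illumina-style encoding; Sanger-style FASTQ uses 33), and the variable names are made up for the example. With ShortRead loaded, I believe as(quality(aln), "matrix") performs a comparable conversion on the real object, but please check the package documentation.

```r
# Quality strings copied from the first two Bowtie records quoted below.
qual <- c(
  "DPYWYWYYWWWWPWWYWTVWWYWWWYYWYWXWBBBBBBB",
  "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"
)

# Assumption: Illumina-style encoding, ASCII offset 64 (use 33 for Sanger FASTQ).
offset <- 64L

# One integer vector per read, one element per base.
scores <- lapply(qual, function(q) utf8ToInt(q) - offset)

# Collapse to one number per read, e.g. the mean base quality.
per_read_mean <- vapply(scores, mean, numeric(1))

print(scores[[1]][1:5])
print(per_read_mean)
```

The list `scores` has the same shape as the read qualities (one value per read per base), while `per_read_mean` has the shape of alignQuality (one value per read).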
Kasper
On Jan 20, 2010, at 6:36 AM, Marc Noguera wrote:
> Dear list,
> I am trying to do some quality assessment on Solexa runs using
> Bioconductor and ShortRead.
> I am using bowtie as a mapper, which yields bowtie-formatted output with
> fastq scores for alignment, such as:
>> HWUSI-EAS621_91022_1_100_1938_1667 + chr15 53573544
>> CAGTCTCCCAAAGTACTGGGATAATAGGTGTGAGACTCC
>> DPYWYWYYWWWWPWWYWTVWWYWWWYYWYWXWBBBBBBB 0 34:C>A,36:A>T
>> HWUSI-EAS621_91022_1_100_1938_1823 - chr18 34747447
>> ACCCGGGAGTTGGGCTGCTTAGTGGCTGGACTCTCTTCC
>> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0 34:T>G
>> HWUSI-EAS621_91022_1_100_1938_608 + chr19 35665132
>> CAGCTGCTCAGGAGGCTGAGGCAGGAGAATCGCTTGAGC
>> DMTTTSRSTUTTTTTUTTTTTTTTTTQSSBBBBBBBBBB 2
>> HWUSI-EAS621_91022_1_100_1938_1207 + chr22 30069585
>> TCTGGGCCGTGGGGAGGCTCCTCCTTGGCTGATGGCGCC
>> DMTUTTRUTPTSTTUUUTSSTTUTBBBBBBBBBBBBBBB 0 35:T>C,37:A>C
>> HWUSI-EAS621_91022_1_100_1938_222 - chr20 61020239
>> GCCTGGGCCTCCCGAAGTGCTGTGGTTACAGGCATGAGC
>> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 2 25:A>G,34:C>G
>> HWUSI-EAS621_91022_1_100_1938_1562 + chr15 84916971
>> TGGGTTTCACCATGGTGGCCAGGCTGGTCTCAAACTCCT
>> DNUVUWWWWWWWWWWWWWUWWWWWWWVBBBBBBBBBBBB 0
>> HWUSI-EAS621_91022_1_100_1938_1290 - chr9 120742911
>> AGCCCAAGAGAGCCTTCTCCTCGACCATTACCACCAATG
>> BBBBBBBBBBBBBBBBSWRLPWUWRSWUKTWXXWXWWND 0 33:C>A,35:T>C
> When I try to read this file with the readAligned() function with:
>> aln <- readAligned("/path/", pattern="test.fastq.bwt", type="Bowtie", qualityType="FastqQuality")
>
> I obtain an AlignedRead object, which includes quality data.
>>> quality(aln)
>> class: SFastqQuality
>> quality:
>> A BStringSet instance of length 3331015
>> width seq
>> [1] 35 BBB=B?:AA:@?@>?B@@AA@@A;>@4>>7922=>
>> ... ... ...
>> [3331015] 33 %%/<<<1;:<<:<<<<995<<<:<::<<<:<<<
> However, when I try to use these qualities to plot them, I obtain "NA" values:
>>> alignQuality(aln)
>> class: NumericQuality
>> quality: NA NA ... NA NA (3331015 total)
> So, I guess there is some kind of problem when transforming the ASCII
> characters to numerical quality values. I have also tried the
> SFastqQuality type to read the input, with no success.
>
> What am I doing wrong?
>
> thanks in advance
> Marc
>
> --
>
> -----------------------------------------------------
> Marc Noguera i Julian, PhD
> Genomics unit / Bioinformatics
> Institut de Medicina Preventiva i Personalitzada
> del Càncer (IMPPC)
> B-10 Office
> Carretera de Can Ruti
> Camí de les Escoles s/n
> 08916 Badalona, Barcelona
>