[BioC] shortread base quality

Martin Morgan mtmorgan at fhcrc.org
Fri Aug 24 18:30:56 CEST 2012


On 08/24/2012 07:09 AM, David martin wrote:
> Hi,
> I'm trying to get the quality scores for each nucleotide for each read
> from the fastq file.
> Here is how it starts. I know how to get average scores for reads but not
> for each individual nucleotide of each read.
>
>
>
> file <- "file.fastq"
> fqfile <- paste(basename(file),"",sep="")
> path <- dirname(file)
> sp <- SolexaPath(path,dataPath=path,analysisPath=path)
> fq <- readFastq(sp, fqfile)
>
> # Sum of the base quality scores for each read
> score <- alphabetScore(fq)
>
> # Average base quality score for each read
> aveScore <- score / width(fq)

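Try alphabetFrequency, which counts how many times each quality character
occurs in each read (rfq here is a ShortReadQ object, like your fq), e.g.,

   ## keep bases 1-20 or 1-30 of each read; the two widths are recycled
   ragged = narrow(quality(rfq), 1, width=c(20, 30))
   ## one row per read, one column per quality character
   alf = alphabetFrequency(ragged)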

 > dim(alf)
[1] 256  94
 > alf[1:5, 1:5] # not very exciting at this end...
        ! " # $
[1,] 0 0 0 0 0
[2,] 0 0 0 0 0
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
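
Note that alphabetFrequency tabulates quality characters per read, so the position of each base is lost. For a score per base per read, the quality string has to be decoded character by character. Below is a minimal base-R sketch of that decoding, assuming the Phred+33 encoding; phred_scores and the quals strings are made up for illustration and are not part of ShortRead. Within ShortRead itself, as(quality(fq), "matrix") should give the equivalent reads-by-cycle matrix, padded with NA for shorter reads, but check the QualityScore help page for your version.

```r
# Hypothetical helper (not a ShortRead function): decode one FASTQ quality
# string into per-base Phred scores. The offset of 33 is an assumption;
# older Solexa/Illumina files use 64 instead.
phred_scores <- function(qual, offset = 33L) {
    utf8ToInt(qual) - offset
}

# Made-up quality strings for three reads of unequal length.
quals <- c("IIII", "!5?", "##I")

# One integer vector per read, one element per base.
per_base <- lapply(quals, phred_scores)

per_base[[2]]                 # scores for each base of the second read
sapply(per_base, mean)        # per-read averages, as a cross-check
```

Keeping the result as a list avoids padding when reads differ in length; a matrix is more convenient when all reads have the same width.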


Martin

>
> # How can I get the score for each base of each read?
>
> thanks,
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


