[BioC] ShortRead Per Cycle Score
mtmorgan at fhcrc.org
mtmorgan at fhcrc.org
Tue Apr 27 21:26:06 CEST 2010
Hi Dario --
Quoting Dario Strbenac <D.Strbenac at garvan.org.au>:
> Hello again,
>
> I'm now stuck on recreating the per cycle scores from the HTML report graph.
>
> When I do the plot, I get the typical decline to the right and the
> scores go from about 32 at the left to 22 at the right.
>
> QAaligned <- qa("myPath/s_6_1", "uniq.map.gz", "Bowtie", qualityType
> = "SFastqQuality")
> pcq <- QAaligned[["perCycle"]][["quality"]]
> ShortRead:::.plotCycleQuality(pcq)
>
> But then when I calculate the scores by myself I get a very different trend.
>
> sbCyc <- split(pcq, pcq$Cycle)
>
> avgScores <- sapply(sbCyc, function(cycTable){
> totalCounts <- sum(cycTable[, "Count"])
> totalScore = 0
> for(index in 1:nrow(cycTable))
> {
> totalScore = totalScore + (cycTable[index, "Score"] *
> cycTable[index, "Count"])
> }
> avg <- totalScore / totalCounts
> return(avg)
or just
with(cycTable, sum(Score * Count) / sum(Count))
The underlying problem is that Score was not being calculated
correctly (report() did not use Score when generating the figure);
this has been corrected in both release and devel branches, along with
an issue Nicholas Delhomme pointed out about transposed labeling of
'aligned' versus 'filtered' reads for type=SolexaExport, version *.*.1.
Sorry for the confusion.
Martin
> })
>> avgScores
> 1 2 3 4 5 6
> 7 8 9 10 11 12 13
> 14 15 16 17
> 6.663533 7.743573 7.894290 7.752292 7.736450 7.667488
> 7.602862 7.589947 7.452226 7.576258 7.454091 7.204792 7.396363
> 7.282941 8.065390 11.513014 12.371052
> 18 19 20 21 22 23
> 24 25 26 27 28 29 30
> 31 32 33 34
> 12.073634 11.947487 12.039822 11.890664 12.277044 12.662244
> 12.558223 13.296835 12.045379 12.147517 12.305975 10.955502
> 11.982833 11.913569 11.072581 10.821556 12.029892
> 35 36
> 14.325464 13.316201
>
> It increases, and ranges from 6 to 13. I can't see a typo, so am I
> missing some sort of mathematical transformation somewhere ?
>
> Thanks.
>
> --------------------------------------
> Dario Strbenac
> Research Assistant
> Cancer Epigenetics
> Garvan Institute of Medical Research
> Darlinghurst NSW 2010
> Australia
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
More information about the Bioconductor
mailing list