[BioC] Recommended gene model for DESeq
Assaf Gordon
gordon at cshl.edu
Fri Apr 6 23:24:56 CEST 2012
Thank you all for your responses.
I'm still looking for the optimal way to count hits for paired-end RNA-Seq data. May I ask for a couple of clarifications?
Simon Anders wrote, On 04/05/2012 05:11 AM:
> You should make sure that each read is counted only once per gene.
Once per gene - got it.
What about the case where a read matches multiple genes (described as "ambiguous" in the HTSeq-Count/GenomicRanges "modes")?
Is it OK to count such a read several times (once for each of the different genes it overlaps), or would that invalidate the results?
It seems "easyRNASeq" will count a read multiple times (once per gene) when using "geneModels" summarization mode (based on [1], page 10) - so can it be used?
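To make the question concrete, here is a toy sketch of the two behaviours I am asking about. It is only an illustration under made-up assumptions: the gene names, the coordinates, and the helper functions `genes_hit` and `count_reads` are hypothetical, and this is not the actual code of htseq-count, GenomicRanges, or easyRNASeq.

```python
# Toy illustration only: gene names and coordinates are made up, and this
# is not the code of htseq-count, GenomicRanges, or easyRNASeq.
from collections import Counter

# Hypothetical genes as half-open intervals [start, end) on one chromosome.
# geneA and geneB overlap in the region 180..200.
GENES = {"geneA": (100, 200), "geneB": (180, 300), "geneC": (400, 500)}


def genes_hit(start, end):
    """Return the set of genes whose interval overlaps the read [start, end)."""
    return {g for g, (gs, ge) in GENES.items() if start < ge and end > gs}


def count_reads(reads, count_ambiguous):
    """Count reads per gene, each read at most once per gene.

    If count_ambiguous is False, a read overlapping more than one gene goes
    into a separate "ambiguous" bin. If True, it is counted once for every
    gene it overlaps.
    """
    counts = Counter()
    for start, end in reads:
        hits = genes_hit(start, end)
        if not hits:
            counts["no_feature"] += 1
        elif len(hits) > 1 and not count_ambiguous:
            counts["ambiguous"] += 1
        else:
            for gene in hits:
                counts[gene] += 1
    return counts


# Four reads: one in geneA only, one in the geneA/geneB overlap,
# one in geneB only, and one outside every gene.
reads = [(110, 160), (185, 195), (250, 290), (600, 650)]
strict = count_reads(reads, count_ambiguous=False)
multi = count_reads(reads, count_ambiguous=True)
```

With `count_ambiguous=True`, the read in the overlap region contributes to both geneA and geneB, so the per-gene counts sum to more than the number of reads that hit a gene. That inflation is what I am worried about for DESeq.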
> If you want to stay in R: Valerie Obenchain has recently added functionality to GenomicRanges to perform counting in a way similar to my htseq-count script
Related question: handling paired-end data correctly.
It seems GenomicRanges does not handle paired-end reads at all (based on [2], page 2, section "3. counting mode"), so the only option is "htseq-count". Is that correct?
I also couldn't find any mention of paired-end data in "easyRNASeq" PDF [1], so I don't know if it handles that or not.
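By "handling paired-end data correctly" I mean roughly the following: both mates of a pair come from one fragment, so the pair should contribute at most one count. A minimal sketch of that idea, again with made-up genes and coordinates and hypothetical helpers (`genes_hit`, `count_fragments`), and not taken from any of the tools mentioned:

```python
# Toy illustration only: made-up genes and read pairs, not the code of
# any of the tools discussed in this thread.
from collections import Counter

# Hypothetical genes as half-open intervals [start, end) on one chromosome.
GENES = {"geneA": (100, 200), "geneB": (300, 400)}


def genes_hit(start, end):
    """Return the set of genes whose interval overlaps [start, end)."""
    return {g for g, (gs, ge) in GENES.items() if start < ge and end > gs}


def count_fragments(pairs):
    """Count each read pair once, using the genes hit by either mate."""
    counts = Counter()
    for mate1, mate2 in pairs:
        # Pool the genes overlapped by the two mates of the same fragment.
        hits = genes_hit(*mate1) | genes_hit(*mate2)
        if len(hits) == 1:
            counts[next(iter(hits))] += 1
        elif hits:
            counts["ambiguous"] += 1
        else:
            counts["no_feature"] += 1
    return counts


pairs = [
    ((110, 150), (160, 190)),  # both mates in geneA
    ((120, 160), (500, 540)),  # one mate in geneA, the other in no gene
    ((150, 190), (310, 350)),  # mates in two different genes
]
fragment_counts = count_fragments(pairs)
```

Here geneA gets 2 counts from three fragments. Counting every mate as an independent read would instead give geneA 4 counts from the same data, which is the double counting I would like to avoid.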
Thanks,
-gordon
[1] http://bioconductor.org/packages/release/bioc/vignettes/easyRNASeq/inst/doc/easyRNASeq.pdf
[2] http://bioconductor.org/packages/release/bioc/vignettes/GenomicRanges/inst/doc/summarizeOverlaps-modes.pdf