[BioC] DESeq on transcripts v/s genes
Wolfgang Huber
whuber at embl.de
Sun Feb 5 15:59:31 CET 2012
A clarification (after an off-list request): there are two possibilities for
double counting, and with the post below I'm referring to only one of them:
1. Creating a transcript-level count for each possible transcript of a
gene, essentially by *treating each transcript as a separate 'gene'*,
and then calling DESeq or analogous. This is what the post below refers to.
2. Counting the reads touching each exon, and then *summing these
numbers up over all exons of a gene* to get a per-gene (or per
transcript) value. That would be wrong, since then those reads that
touch more than one exon are multiply counted and mess up the
statistical model.
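
To make case 2 concrete, here is a minimal sketch in Python. The exon coordinates and reads are made-up toy data, and the helper names (`overlaps`, `exons_touched`) are hypothetical, not part of DESeq or any counting tool. It only illustrates how summing per-exon counts inflates the per-gene count when reads touch more than one exon.

```python
# Hypothetical toy data: exons of one gene as half-open (start, end) intervals.
exons = {"exon1": (100, 200), "exon2": (300, 400), "exon3": (500, 600)}

# Each read is a list of aligned blocks; spliced reads have two blocks.
reads = [
    [(120, 170)],              # lies inside exon1 only
    [(180, 200), (300, 330)],  # spliced read touching exon1 and exon2
    [(350, 400), (500, 520)],  # spliced read touching exon2 and exon3
    [(510, 560)],              # lies inside exon3 only
]

def overlaps(a, b):
    """True if half-open intervals a and b share at least one position."""
    return a[0] < b[1] and b[0] < a[1]

def exons_touched(read):
    """Names of all exons that any block of the read overlaps."""
    return {name for name, exon in exons.items()
            if any(overlaps(block, exon) for block in read)}

# Per-exon counting: a read is counted once for every exon it touches.
per_exon = {name: 0 for name in exons}
for read in reads:
    for name in exons_touched(read):
        per_exon[name] += 1

# Summing exon counts counts the two spliced reads twice each.
summed_over_exons = sum(per_exon.values())

# Counting each read once per gene gives the number of reads actually observed.
per_gene = sum(1 for read in reads if exons_touched(read))

print("per-exon counts:  ", per_exon)
print("summed over exons:", summed_over_exons)
print("reads per gene:   ", per_gene)
```

In this toy case four reads yield a summed exon count of six, so the summed value no longer corresponds to a number of independently observed reads.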
Best wishes
Wolfgang
Feb/5/12 12:16 PM, Wolfgang Huber scripsit::
> Dear Abhishek
>
> there was some anxiety regarding double-counting / redundancy in this
> thread. Actually, there is very little reason to worry. DESeq tests
> sequentially one hypothesis after the other. It does not matter whether
> they are correlated or not.
>
> The one consideration where the correlations / redundancy can matter is
> multiple testing correction. As long as you go for FDR, again there is
> little to worry about, since the redundancy pops up both in the numerator
> and denominator of the ratio (the "R" in FDR) and, at least to a good
> enough approximation, cancels out.
>
> If you go for family-wise error rate (FWER) and, say, Bonferroni
> correction, then the redundancy and the increase in the number of tests
> do matter. But there seem to be few reasons to use FWER/Bonferroni in
> this context.
>
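The contrast between the two corrections can be sketched with a small Python example. The assumptions are mine: the p-values are invented, FDR control is done with the Benjamini-Hochberg procedure (the post only says "FDR"), and redundancy is modelled in its most extreme form by testing every hypothesis twice with identical p-values.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    n = len(pvals)
    return [min(1.0, p * n) for p in pvals]

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Invented p-values, and the same set with every test duplicated.
pvals = [0.001, 0.008, 0.02, 0.04, 0.3, 0.7]
duplicated = pvals + pvals

bh_once = benjamini_hochberg(pvals)
bh_twice = benjamini_hochberg(duplicated)[:len(pvals)]
bonf_once = bonferroni(pvals)
bonf_twice = bonferroni(duplicated)[:len(pvals)]

for p, a, b, c, d in zip(pvals, bh_once, bh_twice, bonf_once, bonf_twice):
    print(f"p={p:<6} BH: {a:.4f} -> {b:.4f}   Bonferroni: {c:.4f} -> {d:.4f}")
```

With exact duplicates the Benjamini-Hochberg values stay the same, because the number of tests and the rank of each p-value double together, while the Bonferroni values double until they reach the cap of 1. Real transcript-level redundancy is only partial, so this shows the direction of the effect, not its size.
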
> Hope this helps
> Wolfgang
>
> Feb/2/12 12:46 AM, Abhishek Pratap scripsit::
>> Hi All
>>
>> I am wondering whether, conceptually, I can use DESeq to test for
>> differential expression at the transcript level rather than the gene
>> level. In our case we have generated a transcript model based on
>> RNA-Seq, and if we try to collapse those transcripts to genes in order
>> to do gene-level differential expression, many exons are collapsed to
>> give rise to artificial exons.
>>
>>
>> eg :
>>
>>
>> Transcript 1 : ---------------------- (exon)
>> Transcript 2 : -----------------------------(exon )
>>
>> Gene level : -------------------------------------------- (exon)
>>
>> Another thing that comes to my mind is the effect of double counting,
>> due to exon redundancy, if I take the read counts at the transcript level.
>>
>> I would love to hear from your experience.
>>
>> Thanks!
>> -Abhi
>>
>>
>
> Best wishes
> Wolfgang
>
> Wolfgang Huber
> EMBL
> http://www.embl.de/research/units/genome_biology/huber
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber