[BioC] How to use DESeq to normalize and estimate variance in a RNAseq timecourse analysis

Fri May 11 16:12:41 CEST 2012

Hi Marie

On 05/11/2012 02:52 PM, Marie Sémon wrote:
> We wanted first to determine a set of genes for which expression level
> differs statistically between at least one time point and the controls,
> because we need to estimate the whole set of genes regulated at some
> point or the other by the treatment. This is why we compared
> sequentially Ctr/T1 , Ctr/T2, Ctr/T3 etc... and then took the union of
> these five lists. We performed this kind of analysis because we thought
> that in DESeq it is not possible to test wether a gene is deregulated
> over the whole time series experiment. But perhaps are we wrong here?

Actually, yes. You can test against the null hypothesis "The gene is the 
same as in control in _all_ treatment time points."

To this end, use GLMs as follows (after the dispersion estimnation):

fit1 <- fitNbinomGLMs( cds, count ~ condition )
fit0 <- fitNbinomGLMs( cds, count ~ 1 )
pvals <- nbinomGLMTest( fit1, fit0 )
padj <- p.adjust( pval, method="BH" )

The questions is how useful this is, because, I imagine that very many 
genes will react at some point unless your treatment is quite mild.
Such a long gene list, without any means of subdividing it further, is 
usually not that helpful too reach biological conclusions. This is why, 
for time courses, a clustering analysis (as Wolfgang suggested) is often 
more useful than testing for differential expression.

   Simon