[BioC] How to use DESeq to normalize and estimate variance in a RNAseq timecourse analysis

Simon Anders anders at embl.de
Wed May 9 22:12:29 CEST 2012

Hi Marie

> We are wondering what is the best procedure to prepare this dataset for
> this analysis (steps of normalization + variance estimation):
> 1) is it better to start with normalizing + estimating dispersion on the
> whole dataset (5 points + 3 controls), and then to test for differential
> expression in the two-by-two comparisons just mentioned
> 2) or is it better to normalize + estimate dispersion on restricted
> datasets composed of 1 time-point + 3 controls, and then test for
> differential expression between this time point and the controls.

It makes no difference: Only replicated samples contain information 
about variance, and hence, DESeq ignores your non-controls anyway when 
estimating dispersion.
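
To illustrate the point about replicates, here is a minimal sketch in plain Python. It is not DESeq's actual code: DESeq fits negative binomial dispersions, while this sketch uses an ordinary pooled sample variance as a stand-in. The function name `pooled_variance`, the condition labels and the count values are all made up for the example. It covers only the variance part of the question, not the size factors. The idea it shows is that a condition with a single sample has zero degrees of freedom and so drops out of the pooled estimate.

```python
from statistics import variance


def pooled_variance(groups):
    """Pool per-condition sample variances, weighted by degrees of freedom.

    `groups` maps a condition label to the list of (already normalized)
    values observed for one gene. This is a simplified stand-in for a
    dispersion estimate, not DESeq's negative binomial procedure.
    """
    weighted_sum = 0.0
    total_dof = 0
    for values in groups.values():
        dof = len(values) - 1
        if dof < 1:
            # A single sample says nothing about variability: skip it.
            continue
        weighted_sum += dof * variance(values)
        total_dof += dof
    if total_dof == 0:
        raise ValueError("no condition has replicates")
    return weighted_sum / total_dof


# Hypothetical normalized counts for one gene.
control = [100.0, 120.0, 95.0]

# Option 1: whole dataset (3 controls + 5 unreplicated time points).
whole = {
    "control": control,
    "t1": [150.0], "t2": [180.0], "t3": [210.0], "t4": [90.0], "t5": [60.0],
}

# Option 2: restricted dataset (3 controls + 1 time point).
restricted = {"control": control, "t1": [150.0]}

print(pooled_variance(whole))
print(pooled_variance(restricted))
```

Both calls print the same number, because in either layout only the three controls contribute to the estimate.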

Of course, having replicates only for control and not for at least some 
of the time points after treatment is not a very good design. Unless 
your treatment is as reproducible as the control experiment, you will 
get an inflated number of false positives. Typically, the treatment 
experiment is more involved than the control experiment (for example, 
in a treatment with a drug, the effective dosage or the drug absorption 
might vary to some extent) and hence it would be more important to have 
treatment replicates than control replicates.
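
The false-positive argument can be made concrete with a small calculation. The sketch below is only an illustration under simplifying assumptions that are not part of DESeq: values are treated as normally distributed, the control standard deviation is taken as known, and one treatment sample is compared against the mean of the controls with a two-sided z-test. The function name and the example standard deviations are invented for the example. It computes how often such a test rejects a true null hypothesis when the treatment samples actually vary more than the controls from which the variance was taken.

```python
from math import sqrt
from statistics import NormalDist


def actual_false_positive_rate(sd_control, sd_treatment, n_control=3, alpha=0.05):
    """False-positive rate of a z-test that borrows the control variance.

    The test compares one treatment sample with the mean of `n_control`
    controls and assumes the treatment varies exactly as much as the
    control. There is no true difference in means (the null holds).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)

    # Variance of (treatment - mean(control)) as the test assumes it ...
    assumed_var = sd_control ** 2 * (1 + 1 / n_control)
    # ... and as it really is when the treatment has its own spread.
    true_var = sd_treatment ** 2 + sd_control ** 2 / n_control

    # The nominal threshold, expressed in units of the true standard deviation.
    effective_threshold = z_crit * sqrt(assumed_var / true_var)
    return 2 * (1 - std_normal.cdf(effective_threshold))


# Treatment as reproducible as the control: the nominal rate is kept.
print(actual_false_positive_rate(sd_control=1.0, sd_treatment=1.0))

# Treatment twice as variable as the control: the rate is inflated.
print(actual_false_positive_rate(sd_control=1.0, sd_treatment=2.0))
```

With equal variability the rate stays at the nominal level; as soon as the treatment is noisier than the controls, the rate rises well above it, which is why replicates on the treatment side matter.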

