[BioC] How to use DESeq to normalize and estimate variance in a RNAseq timecourse analysis
Simon Anders
anders at embl.de
Wed May 9 22:12:29 CEST 2012
Hi Marie
> We are wondering what is the best procedure to prepare this dataset for
> this analysis (steps of normalization + variance estimation):
> 1) is it better to start with normalizing + estimating dispersion on the
> whole dataset (5 points + 3 controls), and then to test for differential
> expression in the two-by-two comparisons just mentioned
> 2) or is it better to normalize + estimate dispersion on restricted
> datasets composed of 1 time-point + 3 controls, and then test for
> differential expression between this time point and the controls.
It makes no difference: Only replicated samples contain information
about variance, and hence, DESeq ignores your non-controls anyway when
estimating dispersion.
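
To illustrate the point, here is a minimal Python sketch. It is not DESeq's actual dispersion estimator. It assumes the counts are already normalized, and it uses a plain pooled sample variance for a single gene. The function name pooled_variance and the toy numbers are made up for this example. Conditions with a single sample add no degrees of freedom, so the estimate is the same for the whole dataset and for a subset with one time point plus the controls.

```python
from collections import defaultdict


def pooled_variance(gene_counts, conditions):
    """Pooled within-condition variance for one gene.

    Only conditions with at least two samples contribute, because a
    single sample carries no information about within-condition spread.
    """
    groups = defaultdict(list)
    for value, cond in zip(gene_counts, conditions):
        groups[cond].append(value)

    sum_sq = 0.0
    dof = 0
    for values in groups.values():
        if len(values) < 2:
            continue  # unreplicated condition: skipped
        mean = sum(values) / len(values)
        sum_sq += sum((v - mean) ** 2 for v in values)
        dof += len(values) - 1

    if dof == 0:
        raise ValueError("no replicated condition, variance cannot be estimated")
    return sum_sq / dof


# Toy normalized counts for one gene (hypothetical values):
# 3 control replicates followed by 5 unreplicated time points.
conditions = ["ctrl", "ctrl", "ctrl", "t1", "t2", "t3", "t4", "t5"]
gene = [100.0, 110.0, 90.0, 150.0, 200.0, 260.0, 310.0, 400.0]

# Option 1: whole dataset. Option 2: controls plus one time point.
whole = pooled_variance(gene, conditions)
restricted = pooled_variance(gene[:4], conditions[:4])
```

Both calls use only the three control values, so whole and restricted are equal.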
Of course, having replicates only for control and not for at least some
of the time points after treatment is not a very good design. Unless
your treatment is as reproducible as the control experiment, you will
get an inflated number of false positives. Typically, the treatment
experiment is more involved than the control experiment (for example,
in a drug treatment, the effective dosage or the drug absorption
might vary to some extent), and hence it would be more important to have
replicates of the treatment than of the control.
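
A small simulation can make the false-positive argument concrete. This is a sketch under simplifying assumptions, not DESeq's negative binomial test: values are Gaussian, there is no true treatment effect, and a single treated sample is compared with three controls using a t-like statistic whose variance comes from the controls only. The function name false_positive_rate and all parameter values are hypothetical.

```python
import random
import statistics

# Two-sided 5% critical value of Student's t with 2 degrees of freedom
# (3 control replicates give 2 degrees of freedom).
T_CRIT_DF2 = 4.303


def false_positive_rate(treatment_sd, control_sd=1.0, n_genes=5000, seed=0):
    """Fraction of null genes called significant at a nominal 5% level.

    The variance is estimated from the 3 controls only, then applied to
    one treated sample that has no true effect.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        controls = [rng.gauss(0.0, control_sd) for _ in range(3)]
        treated = rng.gauss(0.0, treatment_sd)  # same mean as the controls
        s = statistics.stdev(controls)
        # Standard error of (one new sample minus mean of 3 controls),
        # valid only if the treated sample shares the control variance.
        t = (treated - statistics.mean(controls)) / (s * (1 + 1 / 3) ** 0.5)
        if abs(t) > T_CRIT_DF2:
            hits += 1
    return hits / n_genes


# Treatment as reproducible as control: rate stays near the nominal level.
rate_same = false_positive_rate(treatment_sd=1.0)
# Treatment three times as variable: the control-based variance is too small.
rate_noisier = false_positive_rate(treatment_sd=3.0)
```

With equally reproducible treatment and control, the rate should stay close to the nominal 5%. With a noisier treatment it rises well above that, and only treatment replicates would reveal the extra variability.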
Simon