[BioC] How to use DESeq to normalize and estimate variance in a RNAseq timecourse analysis
Wolfgang Huber
whuber at embl.de
Thu May 10 21:04:57 CEST 2012
Hi Marie
Simon and you raised the point that comparing each of the five time
points (unreplicated) against control, and then presumably comparing
these lists (for what? overlap?) is likely suboptimal.
While each time point does not have a replicate, if the biological
signal that you are interested in changes slowly relative to the
sampling interval, you can still get an idea about some of the
variability in the data, e.g. by fitting a trend and looking at the
residuals. The first thing I would do here, in fact, is to transform the
data to a variance-stabilised scale (with DESeq, as described in the
vignette), filter out all genes that show too little variability overall,
and then cluster the patterns. You don't directly get p-values from that
(though with some imagination that can be done), but it might be a lot
more informative than 5 lists.
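As a rough illustration of that workflow (transform, filter, cluster, then look at residuals around a trend), here is a minimal sketch in Python with NumPy. It is not DESeq: the log2(x + 1) transform is only a stand-in for the variance-stabilising transformation described in the vignette, and the counts, the time values, the threshold `min_sd` and the function names are all hypothetical, chosen just to make the example run.

```python
import numpy as np

def stabilise(counts):
    # Assumption: log2(x + 1) as a crude stand-in for a variance-stabilising
    # transformation. This is NOT the transformation that DESeq provides.
    return np.log2(np.asarray(counts, dtype=float) + 1.0)

def variable_genes(values, min_sd):
    # Boolean mask of genes (rows) whose overall variability is large enough.
    return values.std(axis=1, ddof=1) >= min_sd

def trend_residuals(values, times, degree=1):
    # Fit one polynomial trend per gene over time and return the residuals.
    times = np.asarray(times, dtype=float)
    coefs = np.polyfit(times, values.T, degree)      # (degree + 1, n_genes)
    fitted = np.vander(times, degree + 1) @ coefs    # (n_times, n_genes)
    return values - fitted.T

def kmeans(profiles, k, n_iter=50):
    # Very small k-means with a deterministic start (evenly spaced rows).
    n = profiles.shape[0]
    centres = profiles[np.linspace(0, n - 1, k).astype(int)].copy()
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        dists = ((profiles[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = profiles[labels == j].mean(axis=0)
    return labels

# Made-up counts: 6 genes (rows) x 5 time points (columns).
times = [1, 2, 3, 4, 5]
counts = np.array([
    [10, 20, 40, 80, 160],      # rising
    [12, 25, 38, 85, 150],      # rising
    [160, 80, 40, 20, 10],      # falling
    [150, 85, 38, 25, 12],      # falling
    [50, 51, 49, 50, 52],       # essentially flat
    [100, 100, 100, 100, 100],  # flat
])

values = stabilise(counts)
keep = variable_genes(values, min_sd=0.5)            # drop near-constant genes
kept = values[keep]

# Cluster the shapes of the profiles (each gene centred on its own mean).
centred = kept - kept.mean(axis=1, keepdims=True)
labels = kmeans(centred, k=2)

# Scatter around a linear trend as a rough per-gene variability estimate.
resid = trend_residuals(kept, times, degree=1)
resid_sd = resid.std(axis=1, ddof=2)                 # 2 parameters fitted

print(keep, labels, resid_sd)
```

The residual spread is only meaningful if the chosen trend is flexible enough for the real signal but not so flexible that it absorbs the noise; with five time points a straight line or a quadratic is about all that can be fitted.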
In any case, having a replicate of the time course seems essential for
reliable inference.
Best wishes
Wolfgang
On May 9, 2012, 10:03 PM, Marie Sémon wrote:
> Dear all,
>
> We are using DESeq to analyse differential expression in an RNA-seq
> timecourse analysis (5 time points after treatment + control).
> The dataset contains 3 replicates for the control, and single measures
> for each time point. For each timepoint, we aim to extract differentially
> expressed genes relative to control.
>
> We are wondering what is the best procedure to prepare this dataset for
> this analysis (steps of normalization + variance estimation):
> 1) is it better to start with normalizing + estimating dispersion on the
> whole dataset (5 points + 3 controls), and then to test for differential
> expression in the two-by-two comparisons just mentioned?
> 2) or is it better to normalize + estimate dispersion on restricted
> datasets composed of 1 time point + 3 controls, and then test for
> differential expression between this time point and the controls?
>
> It seems to us that the first procedure is better, because it may be
> less sensitive to outliers. But we would be grateful to have your
> enlightened input.
>
> Thank you very much in advance,
>
> Cheers,
>
> Marie
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber