[BioC] heatmap with variance stabilizing transformed expression data in DESeq

Simon Anders anders at embl.de
Thu May 9 12:06:16 CEST 2013


Hi Steve

>> These 500 four FC genes are subset from 4018 genes which are  detected as DE in edgeR with FDR < 0.05. So, it is both above four FC and with critical FDR values.
>
> In this case, I retract my bet listed as my original (5) point from
> the first email :-)

Don't be so fast there. A log2 fold change of 4 is a very extreme 
change. Only a few genes, likely far fewer than 500 (depending on the 
type of experiment, of course), will truly have such a large fold change. 
I guess that most of these genes have low counts, so that their estimated 
fold changes are exaggerated. Their true fold changes are large enough to 
give a significant signal, but they surpass the fold-change threshold 
only because the counts are so low.
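
To illustrate the exaggeration with made-up numbers, here is a minimal 
Python sketch. The counts, the function names and the pseudocount of 8 are 
assumptions chosen for illustration, and the shifted logarithm is only a 
crude stand-in for a variance-stabilizing transformation, not DESeq's 
actual VST.

```python
import math

def raw_log2_fc(count_a, count_b):
    """Log2 fold change computed directly from two counts."""
    return math.log2(count_b / count_a)

def shifted_log2_fc(count_a, count_b, pseudo=8.0):
    """Log2 fold change after adding a pseudocount to both counts.

    The pseudocount damps fold changes that rest on very few reads;
    its value here is arbitrary.
    """
    return math.log2((count_b + pseudo) / (count_a + pseudo))

low_count_gene = (1, 16)       # hypothetical gene with very few reads
high_count_gene = (100, 1600)  # hypothetical gene with many reads

for name, (a, b) in [("low", low_count_gene), ("high", high_count_gene)]:
    print(name, round(raw_log2_fc(a, b), 2), round(shifted_log2_fc(a, b), 2))
```

Both genes have the same raw log2 fold change of 4, but on the shifted 
scale the low-count gene drops well below that value while the high-count 
gene barely moves.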

This is why your recommendation still stands: First, perform the VST, 
then choose the 500 significant genes with the largest log fold 
changes _after_ VST, and plot the heatmap of these.
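
The selection step could be sketched as follows, in plain Python rather 
than R. This assumes the transformed values and the list of significant 
genes have already been computed elsewhere; the function name, the data 
layout (a dict from gene to per-sample values) and the toy numbers are all 
hypothetical.

```python
from statistics import mean

def top_genes_after_transform(transformed, significant, group_a, group_b, n=500):
    """Rank significant genes by absolute difference of group means.

    transformed: dict mapping gene name to a list of transformed values,
                 one per sample (differences on this scale behave like
                 shrunken log fold changes).
    significant: gene names that passed the p-value / FDR criterion.
    group_a, group_b: sample indices of the two conditions.
    """
    scores = {}
    for gene in significant:
        values = transformed[gene]
        diff = mean(values[i] for i in group_b) - mean(values[i] for i in group_a)
        scores[gene] = abs(diff)
    # Largest absolute difference first; keep at most n genes.
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Toy data: four genes, two samples per condition.
toy = {
    "g1": [5.0, 5.2, 9.0, 9.4],
    "g2": [3.0, 3.1, 3.4, 3.3],
    "g3": [8.0, 8.1, 6.0, 6.1],
    "g4": [1.0, 1.0, 7.0, 7.0],  # large change, but not significant
}
selected = top_genes_after_transform(
    toy, significant=["g1", "g2", "g3"], group_a=[0, 1], group_b=[2, 3], n=2
)
print(selected)
```

The rows of the transformed matrix for the selected genes are then what 
goes into the heatmap.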

With this, you have the awkward combination of a criterion on p values, 
which works on the raw data, and one on fold changes, which works on the 
shrunken data.

Improving on this was one of the motivations for developing DESeq2's 
coefficient shrinkage functionality, so doing the whole analysis in 
DESeq2 should prove much more natural.

   Simon


