[BioC] heatmap with variance stabilizing transformed expression data in DESeq
Simon Anders
anders at embl.de
Thu May 9 12:06:16 CEST 2013
Hi Steve
>> These 500 four FC genes are subset from 4018 genes which are detected as DE in edgeR with FDR < 0.05. So, it is both above four FC and with critical FDR values.
>
> In this case, I retract my bet listed as my original (5) point from
> the first email :-)
Don't be so fast there. A log2 fold change of 4 is a very extreme
change. Only a few genes, likely far fewer than 500 (depending on the
type of experiment, of course), will have such a large fold change. I
guess that most of these genes have low counts, so that their estimated
fold changes are exaggerated. Their true fold changes are large enough
to give a significant signal, but it is only because the counts are so
low that the estimated fold changes also surpass the threshold.
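
To make the low-count effect concrete, here is a minimal sketch in plain
Python. It is not DESeq's VST: a simple pseudocount added before taking
the log is used as a crude stand-in for shrinkage, and the counts and
the pseudocount value are made up for illustration.

```python
import math


def log2_fold_change(count_a, count_b, pseudocount=0.0):
    """log2 ratio of condition B over condition A.

    A positive pseudocount is added to both counts before taking the
    ratio. This is only a crude stand-in for a variance-stabilizing
    transformation, not the method DESeq uses.
    """
    return math.log2((count_b + pseudocount) / (count_a + pseudocount))


# Two hypothetical genes with the same raw 16-fold ratio but very
# different sequencing depth.
low_count_gene = (1, 16)
high_count_gene = (100, 1600)

PSEUDOCOUNT = 8.0  # assumed value, chosen only for illustration

raw_low = log2_fold_change(*low_count_gene)
raw_high = log2_fold_change(*high_count_gene)
shrunk_low = log2_fold_change(*low_count_gene, pseudocount=PSEUDOCOUNT)
shrunk_high = log2_fold_change(*high_count_gene, pseudocount=PSEUDOCOUNT)

print("raw log2 FC:    low-count %.2f, high-count %.2f" % (raw_low, raw_high))
print("shrunk log2 FC: low-count %.2f, high-count %.2f" % (shrunk_low, shrunk_high))
```

On the raw counts both genes pass a log2 fold change cutoff of 4. After
the pseudocount, the well-covered gene stays close to 4, while the
low-count gene drops well below the cutoff.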
This is why your recommendation still stands: First, perform the VST,
then choose the 500 significant genes with the largest log fold
changes _after_ VST, and plot the heatmap of these.
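
The selection step can be sketched as follows, again in plain Python
and under stated assumptions: `started_log2` is a hypothetical stand-in
for the VST (in practice the transformed values would come from DESeq),
and the gene records, field names and adjusted p values are invented
for illustration.

```python
import math


def started_log2(count, offset=8.0):
    """Hypothetical stand-in for a VST: log2 after adding an offset."""
    return math.log2(count + offset)


def top_genes_after_transform(genes, n, alpha=0.05):
    """Return the names of the n significant genes with the largest
    absolute log fold change, computed on the transformed scale."""
    # Criterion 1: significance, taken from the test on the raw counts.
    significant = [g for g in genes if g["padj"] < alpha]

    # Criterion 2: effect size, measured after the transformation.
    def abs_lfc(gene):
        return abs(started_log2(gene["mean_b"]) - started_log2(gene["mean_a"]))

    ranked = sorted(significant, key=abs_lfc, reverse=True)
    return [g["name"] for g in ranked[:n]]


# Made-up per-condition mean counts and adjusted p values.
genes = [
    {"name": "geneA", "mean_a": 1, "mean_b": 16, "padj": 0.01},
    {"name": "geneB", "mean_a": 100, "mean_b": 1600, "padj": 0.001},
    {"name": "geneC", "mean_a": 200, "mean_b": 50, "padj": 0.02},
    {"name": "geneD", "mean_a": 10, "mean_b": 900, "padj": 0.30},
]

selected = top_genes_after_transform(genes, n=2)
print(selected)
```

With these numbers, geneD is dropped for lack of significance, and the
low-count geneA ranks below geneB and geneC even though its raw ratio
is 16-fold. The rows of the heatmap would then be the selected genes,
plotted on the transformed scale.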
With this approach, you have the awkward combination of a criterion on
p values, working on the raw data, and one on fold changes, working on
the shrunken data.
Improving on this was one of the motivations for developing DESeq2's
coefficient shrinkage functionality, so doing the whole analysis in
DESeq2 should prove much more natural.
Simon