[BioC] Too many (?) differentially expressed genes - edgeR and DESeq

Simon Anders anders at embl.de
Tue Jul 23 11:05:19 CEST 2013


Hi Darya

A dispersion of 0.02 is typical for cell line experiments, but only for 
simple ones. For an experiment involving a full month of incubation, it 
really seems quite low. On the other hand, you do have quite drastic 
changes: very many genes change more than 32-fold (5 log2 units on the MA 
plot), and they would still be significant even if you had a higher 
dispersion estimate.
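
To put these two numbers on a common footing, here is a small Python sketch. It is not taken from your analysis, and the function names are made up for illustration. It assumes the usual negative-binomial parametrisation behind edgeR and DESeq, variance = mean + dispersion * mean^2, in which the square root of the dispersion is the biological coefficient of variation between replicates.

```python
import math

def biological_cv(dispersion):
    """Replicate-to-replicate coefficient of variation, ignoring Poisson noise."""
    return math.sqrt(dispersion)

def nb_variance(mean, dispersion):
    """Negative-binomial variance: Poisson part plus biological part."""
    return mean + dispersion * mean ** 2

def fold_from_log2(log2_units):
    """Convert a distance on the log2 axis of an MA plot to a fold change."""
    return 2.0 ** log2_units

print(round(biological_cv(0.02), 3))  # 0.141, i.e. about 14% variation
print(fold_from_log2(5))              # 32.0
```

So a dispersion of 0.02 corresponds to replicates differing by roughly 14%, which is small next to a 32-fold change.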

Given the dramatic changes in phenotype that one sees in 
differentiation, the strong changes in gene expression are not that 
surprising. In the end, it seems entirely reasonable to me to say that 
hardly any gene is at the same level in a stem cell as in a terminally 
differentiated cell.

Remember that a significant p value only means that the gene's log fold 
change is not zero and that the observed _direction_ of change is likely 
the true one. It says nothing about the magnitude of the change. You are 
hence in a situation where you are no longer interested in _which_ genes 
change (because the answer simply is: most of them), but in the strength 
of the change: Which genes have changed dramatically, which genes 
strongly and which have changed only a bit? Hence, you should now look 
at fold changes rather than p values.

Using ordinary log2 fold change values can give you a misleading 
picture: As you can see in the MA plot, weakly expressed genes seem to 
have the strongest changes, but this is only an artifact: for genes with 
low counts, the fold-change estimates are more variable and hence more 
likely to be exaggerated.
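
A rough way to see the size of this effect is a delta-method approximation, sketched below in Python. It assumes negative-binomial counts with variance = mean + dispersion * mean^2, the same mean and the same number of replicates in both groups, and a hypothetical three replicates per group. The function name is invented for illustration. The dispersion of 0.02 is the value discussed above.

```python
import math

def log2fc_standard_error(mean_count, dispersion, n_per_group):
    """Approximate standard error of a log2 fold change between two groups.

    Delta method: Var(ln of a group mean) ~ (1/mean + dispersion) / n.
    The fold change compares two such group means, hence the factor 2.
    """
    var_ln = 2.0 * (1.0 / mean_count + dispersion) / n_per_group
    return math.sqrt(var_ln) / math.log(2)  # convert natural log to log2

# n_per_group=3 is an assumption, not a number from the experiment.
for mu in (5, 50, 500, 5000):
    se = log2fc_standard_error(mu, dispersion=0.02, n_per_group=3)
    print(f"mean count {mu:>5}: SE of log2 fold change ~ {se:.2f}")
```

The 1/mean term dominates for low counts, so the standard error grows as the mean drops. For high counts it levels off at a floor set by the dispersion alone.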

This is why we introduced "shrunken log2 fold changes" in DESeq2: they 
give you a more realistic picture of the strength of changes across the 
dynamic range. See the DESeq2 vignette, and especially this tutorial for 
more explanations:

http://www.bioconductor.org/help/course-materials/2013/CSAMA2013/tuesday/afternoon/DESeq2_parathyroid.pdf
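
To give a feeling for what shrinkage does, here is a toy Python sketch. It is not DESeq2's actual procedure (see the vignette for that). It is only the textbook normal-normal case: if true log2 fold changes are assumed to follow a zero-centred normal prior with standard deviation prior_sd, and an estimate has a given standard error, then the posterior mean pulls the estimate towards zero by a weight that depends on how noisy it is. The function name and all numbers are hypothetical.

```python
def shrink_log2fc(estimate, standard_error, prior_sd):
    """Posterior mean of a log2 fold change under a zero-centred normal prior."""
    # Weight is near 1 for precise estimates and near 0 for noisy ones.
    weight = prior_sd ** 2 / (prior_sd ** 2 + standard_error ** 2)
    return weight * estimate

# Same raw estimate of 5 log2 units, but different precision:
print(shrink_log2fc(5.0, standard_error=2.0, prior_sd=2.0))  # 2.5 (noisy, weak gene)
print(shrink_log2fc(5.0, standard_error=0.2, prior_sd=2.0))  # close to 4.95 (precise)
```

The noisy estimate is halved, while the precise one is left almost untouched. This is the qualitative behaviour that makes shrunken fold changes comparable across the dynamic range.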

   Simon


