[BioC] Too many (?) differentially expressed genes - edgeR and DESeq

Darya Vanichkina d.vanichkina at gmail.com
Tue Jul 23 08:32:16 CEST 2013


Hi Wolfgang,

Thank you; that is very likely, given that the experimental setup uses replicate wells/plates of one cell line per condition (the "standard" in the field, which has worried me for quite a while). But I'm afraid I have more questions:

I thought the edgeR BCV and dispersion were meant to test this very thing, and my values of Disp = 0.0189, BCV = 0.1375 seem closer to the expected value for a genetically identical model organism/cell line than to that for technical replicates. According to the edgeR manual: "Typical values for the common BCV (square-root-dispersion) for datasets arising from well-controlled experiments are 0.4 for human data, 0.1 for data on genetically identical model organisms or 0.01 for technical replicates."... "The BCV (square root of the common dispersion) here is 14%, a typical size for a laboratory experiment with a cell line or a model organism."
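For reference, the two numbers above are related by BCV = sqrt(dispersion). Below is a minimal Python sketch of that arithmetic and of the comparison against the three values quoted from the manual. The variable names are illustrative and not part of edgeR, and picking the "closest" category on a log scale is an assumption made only for this sketch.

```python
import math

# Common dispersion reported in this post (as estimated by edgeR).
common_dispersion = 0.0189

# The BCV is the square root of the common dispersion.
bcv = math.sqrt(common_dispersion)

# Typical common-BCV values quoted from the edgeR manual.
typical_bcv = {
    "human data": 0.4,
    "genetically identical model organisms": 0.1,
    "technical replicates": 0.01,
}

# Assumption for illustration: compare on a log scale, since the
# quoted values span more than an order of magnitude.
closest = min(
    typical_bcv,
    key=lambda name: abs(math.log(bcv) - math.log(typical_bcv[name])),
)

print(f"BCV = {bcv:.4f}; closest quoted category: {closest}")
```

This only restates the numbers already given; it says nothing about whether the replicates capture enough biological variability.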

But unlike the one in the edgeR manual, my BCV plot looks "flatter" and my smear plot is a lot "redder"...

1) Is the fact that the Y-axis of my BCV plot is shorter than the one in the manual (ymax(manual) ~= 1.2; ymax(me) ~= 0.8) evidence of this under-variation?

2) Is there a way to test whether there is too little variation (apart from looking at the final plots and seeing all that red)? For example, if I add two more cell lines (+1 per condition), how would I then test whether I had enough variability, or whether I should add another 2 cell lines?


3) How would you go about doing the differential expression analysis, since budget/time constraints mean that I can't do the (obviously) preferable experiment of doing more cell lines?

So far what I've done is just filtering based on:
- abs(logFC) > 1
- logCPM > 0.5
- detection by both edgeR and DESeq

These are all rather arbitrary cutoffs, and I know I'm losing some biologically meaningful results with the fold-change cutoff, since a few "expected" genes for the comparison end up discarded in the non-significant pile. Is there a better way?
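To make the three filters listed above concrete, here is a minimal Python sketch applied to a small table. The gene names, values, and field names (`edger_sig`, `deseq_sig`) are all made up for illustration; they are not output from my data or from either package.

```python
# Hypothetical per-gene results; every name and number here is invented.
results = [
    {"gene": "geneA", "logFC": 2.3, "logCPM": 4.1, "edger_sig": True, "deseq_sig": True},
    {"gene": "geneB", "logFC": -0.7, "logCPM": 6.0, "edger_sig": True, "deseq_sig": True},
    {"gene": "geneC", "logFC": -1.8, "logCPM": 0.2, "edger_sig": True, "deseq_sig": True},
    {"gene": "geneD", "logFC": 1.5, "logCPM": 3.3, "edger_sig": True, "deseq_sig": False},
    {"gene": "geneE", "logFC": -1.2, "logCPM": 2.0, "edger_sig": True, "deseq_sig": True},
]

MIN_ABS_LOGFC = 1.0  # abs(logFC) > 1
MIN_LOGCPM = 0.5     # logCPM > 0.5


def called_by_both(row):
    """Gene is detected by both edgeR and DESeq."""
    return row["edger_sig"] and row["deseq_sig"]


def passes_all(row):
    """Apply the three cutoffs together."""
    return (
        abs(row["logFC"]) > MIN_ABS_LOGFC
        and row["logCPM"] > MIN_LOGCPM
        and called_by_both(row)
    )


kept = [r["gene"] for r in results if passes_all(r)]

# Genes that fail only the fold-change cutoff: called by both tools and
# sufficiently expressed, but with abs(logFC) <= 1.
lost_to_fold_change = [
    r["gene"]
    for r in results
    if called_by_both(r)
    and r["logCPM"] > MIN_LOGCPM
    and abs(r["logFC"]) <= MIN_ABS_LOGFC
]

print("kept:", kept)
print("lost only to the fold-change cutoff:", lost_to_fold_change)
```

In this toy table, `geneB` stands in for the kind of "expected" gene that both tools call but that the fold-change cutoff throws away.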

Thanks in advance,
Darya

