I have normalized my RNA-seq read counts in edgeR. If I look at the normalized data:

total <- cpm(w, normalized.lib.sizes=TRUE)
and specifically at the CPM values for the gene Hspa5 in my cell line 1 (in triplicate), I get:

total["Hspa5",1:3]
7492.944 6750.397 5727.190

If I compute the fold change between cell line 1 and my control cell line 2, I get:

total["Hspa5",1:3]/total["Hspa5",27:29]
3.409239 3.399253 2.910913

So by this very basic test I find a fold change of about 3.3 between my two cell lines, and the triplicates agree well.

Then I use edgeR to test the same comparison:
et <- exactTest(w, pair=c(13,1))
et["Hspa5",]
logFC logCPM PValue FDR
Hspa5 1.80454 12.11638 2.519341e-65 4.262793e-63

FC = exp(1.80454) = 6.077175
CPM = exp(12.11638) = 182842

So after edgeR has made the comparison between the two cell lines, the CPM value is suddenly 27x higher and (more importantly for me) the fold change is now 2x higher (6x instead of 3x). Can anyone explain this difference? I have read the edgeR User's Guide and the CPM-relevant parts of the Reference Manual, but I still have not found the answer.

I know that the log(CPM) takes the estimated dispersions and the library sizes into account, so it is a bit different from the CPM computed directly ... but a 27x difference? And why the 2x difference in the fold change?
Which of the two results would you report in a table of RNA-seq results in a publication?
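
One thing that may be worth checking is the log base used in the back-transformation. The sketch below assumes that logFC and logCPM are reported on a log2 scale (as the edgeR documentation describes) rather than as natural logs, and compares both back-transformations with the hand-computed ratios. The variable names are illustrative only; the numbers are copied from the output above.

```r
# Values copied from the exactTest() output above.
logFC  <- 1.80454
logCPM <- 12.11638

# Back-transformation with natural logs, as done above.
fc_natural  <- exp(logFC)
cpm_natural <- exp(logCPM)

# Back-transformation assuming base-2 logs
# (assumption: logFC and logCPM are on a log2 scale).
fc_log2  <- 2^logFC
cpm_log2 <- 2^logCPM

# Mean of the three per-replicate ratios computed by hand above.
fc_manual <- mean(c(3.409239, 3.399253, 2.910913))

print(round(c(fc_natural = fc_natural, fc_log2 = fc_log2, fc_manual = fc_manual), 2))
print(round(c(cpm_natural = cpm_natural, cpm_log2 = cpm_log2)))
```

Under the base-2 assumption the fold change comes out near 3.5 and the average CPM near 4,400, which is much closer to the hand-computed values than the natural-log versions.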

__________________________

Christian Schrøder Kaas
PhD student at The Technical University of Denmark and Novo Nordisk A/S



