[BioC] Interpreting DESeq2 results

Michael Muratet mmuratet at hudsonalpha.org
Thu Mar 28 17:00:17 CET 2013


I have an experiment:

> design(dse)
~ factor1 + factor2 + factor3

where factor1 has two levels, factor2 has three levels and factor3 has three levels. I extract a gene of interest from the results for each term (I've changed the indices to reflect the condition):

> lapply(resultsNames(dse),function(u) results(dse,u)["gene_A",])
        baseMean log2FoldChange        pvalue           FDR
gene_A 1596.548       10.77485 3.309439e-216 7.025442e-216
        baseMean log2FoldChange    pvalue       FDR
gene_A 1596.548      0.3386776 0.1307309 0.3587438
        baseMean log2FoldChange    pvalue       FDR
gene_A 1596.548     -0.6882543 0.0613569 0.1007896
        baseMean log2FoldChange   pvalue       FDR
gene_A 1596.548      0.2393368 0.513216 0.6589575
        baseMean log2FoldChange    pvalue       FDR
gene_A 1596.548      0.1584153 0.6423634 0.8503163
        baseMean log2FoldChange       pvalue         FDR
gene_A 1596.548      -1.627898 1.823141e-06 0.001409384
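
For reference, the six tables above would match the six columns of a standard treatment-contrast design matrix for this formula: one intercept, one term for factor1, and two each for factor2 and factor3. The sketch below uses base R only and assumes that resultsNames() follows the model.matrix() columns; the level names "a", "b", "c" are made up for illustration.

```r
# Hypothetical level names; only the number of levels per factor
# (2, 3, 3) comes from the description above.
samples <- expand.grid(
  factor1 = factor(c("a", "b")),
  factor2 = factor(c("a", "b", "c")),
  factor3 = factor(c("a", "b", "c"))
)

# Base R design matrix with the default treatment contrasts.
mm <- model.matrix(~ factor1 + factor2 + factor3, data = samples)

# One column per fitted coefficient.
coef_names <- colnames(mm)
n_coef <- ncol(mm)
print(coef_names)
print(n_coef)
```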

I want to be sure I understand the output format. Is it true that the coefficients (the vector beta) from the fit are the baseMean value scaled by 2^log2FoldChange? For example, is the true intercept value 1596.548 * 2^10.77485 = 2797274.13?
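
The arithmetic behind that question can be reproduced in R as follows. This is only a sketch of the two candidate readings, assuming the first table printed above is the intercept term; it does not settle which value, if either, is the fitted intercept.

```r
base_mean <- 1596.548   # baseMean reported for gene_A
lfc_first <- 10.77485   # log2FoldChange in the first results table

# The reading proposed in the question: baseMean scaled by 2^log2FoldChange.
scaled <- base_mean * 2^lfc_first

# For comparison, the same coefficient back-transformed from the log2
# scale on its own, without multiplying by baseMean.
unscaled <- 2^lfc_first

print(c(scaled = scaled, unscaled = unscaled))
```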

mcols() tells me that the baseMean term is calculated over "all rows". The baseMean differs between genes, although for a given gene it is the same across all the conditions, so I'm not seeing how the rows are selected.
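
One reading that would be consistent with this behaviour is sketched below in base R. It assumes that baseMean is the per-gene mean of normalized counts over every sample, and that "rows" means the rows of the sample table rather than a subset of genes. This is an assumption, not something the output above confirms, and the counts and size factors are made up.

```r
# Toy data, entirely made up: 2 genes (rows) by 4 samples (columns).
counts <- matrix(
  c(10, 20, 30, 40,
    100, 100, 200, 200),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("gene_A", "gene_B"), paste0("sample", 1:4))
)

# Hypothetical per-sample size factors.
size_factors <- c(sample1 = 1, sample2 = 2, sample3 = 1, sample4 = 2)

# Divide each column by its sample's size factor.
normalized <- sweep(counts, 2, size_factors, "/")

# One mean per gene, taken over all samples, so it does not depend on
# which term of the design is being reported.
base_mean <- rowMeans(normalized)
print(base_mean)
```

Under this reading the value would differ from gene to gene but stay the same in every results table for a given gene.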



Michael Muratet, Ph.D.
Senior Scientist
HudsonAlpha Institute for Biotechnology
mmuratet at hudsonalpha.org
(256) 327-0473 (p)
(256) 327-0966 (f)

Room 4005
601 Genome Way
Huntsville, Alabama 35806
