[BioC] Differential expression of RNA-seq data using limma and voom()

Gordon K Smyth smyth at wehi.EDU.AU
Thu Nov 28 03:01:26 CET 2013

Dear Jon,

No, it is absolutely not ok to input FPKM values to voom.  The results 
will be nonsense.  Same goes for edgeR and DESeq, for the reasons 
explained in the documentation for those packages.

Note that a matrix of FPKM is not a matrix of counts.

It seems to me that you have three options in decreasing order of 

1. Get the actual integer counts from which the FPKM were computed and do 
a proper analysis, for example using voom or edgeR.

2. Get the gene lengths and library sizes used to compute the FPKM and 
convert the FPKM back to counts.

3. If FPKM is really all you have, then convert the values to a log2 scale 
and do an ordinary limma analysis as you would for microarray data, using 
eBayes() with trend=TRUE.  Do not use voom, do not use edgeR, do not use 
DESeq.  (Do not pass go and do not collect $200.)  This isn't 100% ideal, 
but is probably the best analysis available.

The third option is similar to the "limma-trend" analysis described in the 
limma preprint, except that it is applied to the logFPKM instead of 
logCPM.  Statistically this will not perform as well as it would applied 
to the logCPM.

Best wishes

Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.

On Tue, 26 Nov 2013, Jon Brate wrote:

> Hi everyone,
> I have a count matrix of FPKM values and I want to estimate 
> differentially expressed genes between two conditions. First I used 
> DESeq2, but I realized that this is not good for FPKM values.

> I then transformed the counts using voom() in the limma package and then 
> used:
> fit <- lmFit(myVoomData,design)
> fit <- eBayes(fit)
> options(digits=3)
> writefile = topTable(fit,n=Inf,sort="none", p.value=0.01)
> write.csv(writefile, file="file.csv")
> My problem is that all of the 6156 genes are differentially expressed 
> (p-value 0.01). Only a few hundred were differentially expressed using 
> DESe2, but I guess that can't be trusted.
> I am new to this type of analysis, and to R, but is it ok to simply 
> transform the data by voom()? Can I use the transformed data in DESeq2? 
> Any other ways I can use FPKM counts to estimate differentially 
> expressed genes?
> Thank you,
> Jon Bråte
>> sessionInfo()
> R version 3.0.2 (2013-09-25)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
> locale:
> [1] C
> attached base packages:
> [1] grid      parallel  stats     graphics  grDevices utils     datasets  methods   base
> other attached packages:
> [1] limma_3.18.3         cummeRbund_2.4.0     Gviz_1.6.0           rtracklayer_1.22.0   GenomicRanges_1.14.3 XVector_0.2.0
> [7] IRanges_1.20.6       fastcluster_1.1.11   reshape2_1.2.2       ggplot2_0.9.3.1      RSQLite_0.11.4       DBI_0.2-7
> [13] BiocGenerics_0.8.0
> loaded via a namespace (and not attached):
> [1] AnnotationDbi_1.24.0   BSgenome_1.30.0        Biobase_2.22.0         Biostrings_2.30.1      Formula_1.1-1
> [6] GenomicFeatures_1.14.2 Hmisc_3.13-0           MASS_7.3-29            RColorBrewer_1.0-5     RCurl_1.95-4.1
> [11] Rsamtools_1.14.2       XML_3.95-0.2           biomaRt_2.18.0         biovizBase_1.10.4      bitops_1.0-6
> [16] cluster_1.14.4         colorspace_1.2-4       dichromat_2.0-0        digest_0.6.3           gtable_0.1.2
> [21] labeling_0.2           lattice_0.20-24        latticeExtra_0.6-26    munsell_0.4.2          plyr_1.8
> [26] proto_0.3-10           scales_0.2.3           splines_3.0.2          stats4_3.0.2           stringr_0.6.2
> [31] survival_2.37-4        tools_3.0.2            zlibbioc_1.8.0
> ----------------------------------------------------------------
> Jon Bråte
> Microbial Evolution Research Group (MERG)
> Department of Biosciences
> University of Oslo
> P.B. 1066 Blindern
> N-0316, Norway
> Email: jon.brate at ibv.uio.no
> Phone: 922 44 582
> Web: mn.uio.no/ibv/english/people/aca/jonbra/index.html

The information in this email is confidential and intend...{{dropped:5}}

More information about the Bioconductor mailing list