[BioC] finding a very large number of false positives using edgeR

Blum, Charles CBlum at mednet.ucla.edu
Thu Jan 16 00:07:49 CET 2014


I am running edgeR on 6 RNA-seq samples that were generated using the exact same protocol but come from different Illumina project runs.
In theory, no genes should be differentially expressed. Nevertheless, edgeR identifies almost 7,000 genes as DE at an FDR of 0.1. This is very puzzling.

I ran edgeR using both the classic approach (exactTest) and the GLM approach.
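
For readers who want to reproduce this kind of comparison, here is a minimal sketch of what the two approaches typically look like in edgeR. It is not the script used for the analysis above, which is not shown in this post. The counts are simulated with no true differences, and the gene count, the simulation parameters, and the Project1 vs Project2 grouping are all assumptions made for illustration.

```r
library(edgeR)

set.seed(1)

# Simulated data (assumption): 1000 genes x 6 samples, same mean in every sample
n_genes <- 1000
mu_gene <- runif(n_genes, 20, 2000)
counts <- matrix(rnbinom(n_genes * 6, mu = rep(mu_gene, 6), size = 10),
                 nrow = n_genes)
rownames(counts) <- paste0("gene", seq_len(n_genes))
colnames(counts) <- c(paste0("Project1_sample", 1:3),
                      paste0("Project2_sample", 1:3))

# Assumed comparison: the three Project1 samples vs the three Project2 samples
group <- factor(rep(c("Project1", "Project2"), each = 3))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)  # TMM normalization factors

# Classic approach: common + tagwise dispersion, then the exact test
y_classic <- estimateCommonDisp(y)
y_classic <- estimateTagwiseDisp(y_classic)
et <- exactTest(y_classic)
res_classic <- topTags(et, n = Inf)$table

# GLM approach: dispersions estimated against a design matrix, then an LRT
design <- model.matrix(~ group)
y_glm <- estimateGLMCommonDisp(y, design)
y_glm <- estimateGLMTrendedDisp(y_glm, design)
y_glm <- estimateGLMTagwiseDisp(y_glm, design)
fit <- glmFit(y_glm, design)
lrt <- glmLRT(fit, coef = 2)
res_glm <- topTags(lrt, n = Inf)$table

# Number of genes called DE at FDR < 0.1 under each approach
n_de_classic <- sum(res_classic$FDR < 0.1)
n_de_glm <- sum(res_glm$FDR < 0.1)
```

Inspecting `y$samples` after `calcNormFactors` shows the library sizes and normalization factors, which is one place to look when the sequencing depth differs between runs.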

To get an idea of sequencing depth:
Sample              Total unique annotated read counts
Project1_sample1    41,440,190
Project1_sample2    26,429,859
Project1_sample3    29,655,944
Project2_sample1    25,423,167
Project2_sample2    30,914,059
Project2_sample3    35,41,714

Could this be due to the variability in sequencing depth between projects?
Could there be anything else in the data or the analysis that violates the assumptions made by edgeR?
Are there any known problems with the newest version of edgeR?

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] GenomicFeatures_1.14.2 AnnotationDbi_1.24.0   GenomicRanges_1.14.4   XVector_0.2.0
 [5] IRanges_1.20.6         biomaRt_2.18.0         edgeR_3.4.2            limma_3.18.7
 [9] DESeq_1.14.0           lattice_0.20-24        locfit_1.5-9.1         Biobase_2.22.0
[13] BiocGenerics_0.8.0     gplots_2.12.1          MASS_7.3-29            heatmap.plus_1.3

loaded via a namespace (and not attached):
 [1] annotate_1.40.0    Biostrings_2.30.1  bitops_1.0-6       BSgenome_1.30.0    caTools_1.16
 [6] DBI_0.2-7          gdata_2.13.2       genefilter_1.44.0  geneplotter_1.40.0 grid_3.0.2
[11] gtools_3.1.1       KernSmooth_2.23-10 RColorBrewer_1.0-5 RCurl_1.95-4.1     Rsamtools_1.14.2
[16] RSQLite_0.11.4     rtracklayer_1.22.0 stats4_3.0.2       survival_2.37-4    tools_3.0.2
[21] XML_3.95-0.2       xtable_1.7-1       zlibbioc_1.8.0

> packageDescription('edgeR')$Maintainer
[1] "Mark Robinson <mark.robinson at imls.uzh.ch>, Davis McCarthy\n<dmccarthy at wehi.edu.au>, Yunshun Chen <yuchen at wehi.edu.au>,\nGordon Smyth <smyth at wehi.edu.au>"



