[BioC] TMM and calcNormFactors: Normalization in baySeq to match edgeR and DESeq
Smith, Hilary A
hilary.smith at gatech.edu
Thu Nov 17 16:07:31 CET 2011
Hello,
I'm working on a couple analyses (currently pairwise) for 3'-DGE. Using baySeq, edgeR, and DESeq are yielding different answers; specifically DESeq and baySeq find different subsets of the genes found by edgeR. In trying to isolate the discrepancy, I've been trying to make items like normalization procedures similar to see if that improves congruency, or if the differences merely stem from how the pairwise tests are run and use of bayesian vs. exact-type statistics. I saw that baySeq's function "getLibsizes" can use the edgeR implementation of TMM, but when I try to do this I get an error message about a quantile argument not being used. This error appears whether or not I specify a quantile, and I'm further confused because the edgeR program itself does not require specifying quantiles for its TMM-based calcNormFactors. EdgeR seems to run fine so I think the problem is in the implementation of baySeq; perhaps I'm misunderstanding/coding something? Any help is greatly appreciated; commands excerpted from an R session are below.
> library(baySeq)
Attaching package: 'baySeq'
The following object(s) are masked from 'package:base':
rbind
> library(snow)
> cl = makeCluster(4, "SOCK")
> library(edgeR)
> simData = read.delim(file="2011.11.03counts.txt", header=TRUE)
> rownames(simData)=simData$CompID
> simData=simData[,-1]
> simData=as.matrix(simData)
> head(simData)
X1E_F X1E_R X2E_F X2E_R X3E_F X3E_R X1P_F X1P_R X2P_F X2P_R X3P_F
comp0 1065 1159 1207 1572 1477 1817 1841 605 1915 1113 1645
comp1 544 534 341 675 333 739 690 236 502 451 571
comp10 30423 37677 28044 54466 23961 58271 53852 34712 59300 40312 44575
comp100 1060 1065 999 1332 918 1620 1697 658 1117 861 1336
comp1000 130 157 229 266 141 247 263 135 182 188 168
comp10000 35 14 15 37 10 47 28 17 22 21 12
X3P_R
comp0 1732
comp1 799
comp10 51243
comp100 1370
comp1000 244
comp10000 64
> replicates = c("F", "R", "F", "R", "F", "R", "F", "R", "F", "R", "F", "R")
> groups = list(NDE = c(1,1,1,1,1,1,1,1,1,1,1,1), DE = c(1,2,1,2,1,2,1,2,1,2,1,2))
> cD = new("countData", data = simData, replicates = replicates, groups=groups)
> cD at libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType="edgeR")
Calculating library sizes from column totals.
Error in calcNormFactors(d, quantile = quantile, ...) :
unused argument(s) (quantile = quantile)
> cD at libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType="TMM")
Error in match.arg(estimationType) :
'arg' should be one of "quantile", "total", "edgeR"
> cD at libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType="edgeR", quantile=0.75)
Calculating library sizes from column totals.
Error in calcNormFactors(d, quantile = quantile, ...) :
unused argument(s) (quantile = quantile)
> cD at libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, quantile=0.75, estimationType="edgeR")
Calculating library sizes from column totals.
Error in calcNormFactors(d, quantile = quantile, ...) :
unused argument(s) (quantile = quantile)
> cD at libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType=c("edgeR", quantile=0.75))
Error in match.arg(estimationType) : 'arg' must be of length 1
> calcNormFactors(cD)
Error in calcNormFactors(cD) :
calcNormFactors() only operates on 'matrix' and 'DGEList' objects
> calcNormFactors(simData)
X1E_F X1E_R X2E_F X2E_R X3E_F X3E_R X1P_F X1P_R
1.0353157 0.9529524 0.9868063 1.1068479 1.0054938 1.0218195 0.9600905 0.8287707
X2P_F X2P_R X3P_F X3P_R
1.0550414 0.8955669 1.0869486 1.1052472
> cD at libsizes = getLibsizes(cD, data=simData, replicates=replicates, subset=NULL, estimationType="edgeR")
Calculating library sizes from column totals.
Error in calcNormFactors(d, quantile = quantile, ...) :
unused argument(s) (quantile = quantile)
> cD at libsizes = getLibsizes(data=simData, replicates=replicates, subset=NULL, estimationType="edgeR")
Calculating library sizes from column totals.
Error in calcNormFactors(d, quantile = quantile, ...) :
unused argument(s) (quantile = quantile)
More information about the Bioconductor
mailing list