[BioC] limma differences with and without an intercept
Rory Kirchner [guest]
guest at bioconductor.org
Sun Feb 9 22:44:50 CET 2014
When setting up a model in limma, if I include or exclude an intercept term, I get slightly different results. For example here is a simplified version of my model, which uses a paired design, each patient is represented twice with a measurement when sick and when recovered:
design = model.matrix(~ patient + status, data=samples_norm)
colnames(design)
[1] "(Intercept)" "patientA010" "patientA011"
[4] "patientA019" "patientA024" "patientA030"
[7] "patientA031" "patientA034" "patientA036"
[10] "patientA041" "patientA055" "patientA067"
[13] "patientA073" "patientA080" "statusexacerbation"
The first patient is patient A004. If I run an analysis and compare patient A010 to the baseline (patientA004):
y = DGEList(counts=counts)
y = calcNormFactors(y)
v = voom(y, design)
fit = lmFit(v, design)
fit = eBayes(fit)
a = rownames(topTable(fit, coef=2, n=Inf, p.value=0.05))
length(a)
[1] 70
If I do the same analysis, but use an intercept term I get a slightly different result:
design = model.matrix(~ 0 + patient + status,
data=samples_norm)
y = DGEList(counts=counts)
y = calcNormFactors(y)
v = voom(y, design)
fit = lmFit(v, design)
cm = makeContrasts(
onepatient=patientA010-patientA004,
levels=design)
fit = contrasts.fit(fit, cm)
fit = eBayes(fit)
b = rownames(topTable(fit, n=Inf, p.value=0.05))
length(b)
[1] 72
> length(intersect(a, b))
[1] 69
This does not occur using the pickrell1 dataset. At first I thought this might just be random due to rounding of the p-values, but one of them is not really close to the cutoff so that isn't it. Should I not be expecting these two ways of parameterizing the model to give the same results?
-- output of sessionInfo():
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin13.0.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] splines grid parallel stats graphics grDevices utils
[8] datasets methods base
other attached packages:
[1] Vennerable_3.0 gtools_3.2.1
[3] RBGL_1.38.0 graph_1.40.1
[5] CHBUtils_0.1 tweeDEseqCountData_1.0.10
[7] sva_3.8.0 statmod_1.4.18
[9] rpart_4.1-5 RColorBrewer_1.0-5
[11] mgcv_1.7-28 nlme_3.1-113
[13] DESeq_1.14.0 lattice_0.20-24
[15] locfit_1.5-9.1 corpcor_1.6.6
[17] BiocInstaller_1.12.0 affyPLM_1.38.0
[19] preprocessCore_1.24.0 gcrma_2.34.0
[21] affy_1.40.0 gplots_2.12.1
[23] biomaRt_2.18.0 knitr_1.5
[25] devtools_1.4.1 reshape_0.8.4
[27] plyr_1.8 DESeq2_1.2.10
[29] RcppArmadillo_0.4.000.2 Rcpp_0.11.0
[31] GenomicRanges_1.14.4 XVector_0.2.0
[33] IRanges_1.20.6 vsn_3.30.0
[35] gridExtra_0.9.1 ggplot2_0.9.3.1
[37] HTSFilter_1.2.1 Biobase_2.22.0
[39] BiocGenerics_0.8.0 edgeR_3.4.2
[41] limma_3.18.11 googleVis_0.4.7
[43] xtable_1.7-1 extrafont_0.16
[45] dplyr_0.1.1
loaded via a namespace (and not attached):
[1] affyio_1.30.0 annotate_1.40.0 AnnotationDbi_1.24.0
[4] assertthat_0.1 Biostrings_2.30.1 bitops_1.0-6
[7] caTools_1.16 codetools_0.2-8 colorspace_1.2-4
[10] compiler_3.0.2 DBI_0.2-7 dichromat_2.0-0
[13] digest_0.6.4 evaluate_0.5.1 extrafontdb_1.0
[16] formatR_0.10 gdata_2.13.2 genefilter_1.44.0
[19] geneplotter_1.40.0 gtable_0.1.2 httr_0.2
[22] KernSmooth_2.23-10 labeling_0.2 MASS_7.3-29
[25] Matrix_1.1-2 memoise_0.1 munsell_0.4.2
[28] proto_0.3-10 RCurl_1.95-4.1 reshape2_1.2.2
[31] RJSONIO_1.0-3 RSQLite_0.11.4 Rttf2pt1_1.2
[34] scales_0.2.3 stats4_3.0.2 stringr_0.6.2
[37] survival_2.37-7 tools_3.0.2 whisker_0.3-2
[40] XML_3.98-1.1 zlibbioc_1.8.0
--
Sent via the guest posting facility at bioconductor.org.
More information about the Bioconductor
mailing list