[BioC] edgeR: "Design matrix not of full rank. The following coefficients not estimable" error
Eugene Bolotin [guest]
guest at bioconductor.org
Sat Dec 21 01:49:43 CET 2013
Hi, I have the following samples:
batch
[1] 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 2055 1802 1802 2055
[16] 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055 1802 1802
[31] 1157 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 1802 2055 2055
[46] 2055 2055 2055 2055 2055 2055 2055 2055 2055 2055
Levels: 1157 1802 2055
treatment
[1] TCGA-BR-6452 TCGA-BR-6453 tumor TCGA-BR-6454 tumor
[6] TCGA-BR-6455 TCGA-BR-6456 TCGA-BR-6457 tumor TCGA-BR-6458
[11] tumor TCGA-BR-6563 TCGA-BR-6565 TCGA-BR-6566 TCGA-BR-7196
[16] TCGA-BR-7703 tumor TCGA-BR-7704 tumor TCGA-BR-7707
[21] TCGA-BR-7715 tumor TCGA-BR-7716 tumor TCGA-BR-7717
[26] tumor TCGA-BR-7723 TCGA-CD-5804 TCGA-CG-4437 TCGA-CG-4441
[31] TCGA-CG-4476 TCGA-CG-5716 TCGA-D7-6518 TCGA-D7-6519 TCGA-D7-6520
[36] TCGA-D7-6521 TCGA-D7-6522 TCGA-D7-6524 TCGA-D7-6525 TCGA-D7-6526
[41] TCGA-D7-6527 TCGA-D7-6528 TCGA-F1-6177 TCGA-F1-6875 TCGA-FP-7735
[46] tumor TCGA-FP-7829 tumor TCGA-HF-7131 TCGA-HF-7132
[51] TCGA-HF-7133 TCGA-HF-7134 TCGA-HF-7136 TCGA-IN-7806 tumor
44 Levels: TCGA-BR-6452 TCGA-BR-6453 TCGA-BR-6454 TCGA-BR-6455 ... tumor
I want to compare each individual TCGA sample (TCGA-X) to the average mutant background. I know this is possible, because I was able to do it with the standard commands.
However, when I try to adjust for batch effects as follows, I get an error:
group=treatment
y=readDGE(files, path=wd, columns=c(1,2), group=group)
# First attempt: batch plus treatment, with an intercept
design=model.matrix(~batch+treatment)
colnames(design)
# Second attempt: same model without the intercept (this is the design used below)
design=model.matrix(~0+batch+treatment)
colnames(design)
design
> y = estimateGLMCommonDisp(y, design, verbose=TRUE)
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
Design matrix not of full rank. The following coefficients not estimable:
treatmentTCGA-CG-4476
As far as I can tell, this is because batch 1157 contains a single normal sample (TCGA-CG-4476, sample 31 above) and no tumor samples.
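To illustrate what I think is going on, here is a minimal sketch in base R with made-up data. The seven-sample layout and the sample names N1, N2 and N3 are hypothetical stand-ins, not my real TCGA samples; only the batch labels are borrowed from above. It is meant to show the suspected confounding, not to reproduce the edgeR call itself.

```r
# Toy data (hypothetical): seven samples in three batches.
# Batch 1157 holds a single sample, N3, which is also its own treatment level.
batch <- factor(c("1802", "1802", "1802", "2055", "2055", "2055", "1157"))
treatment <- factor(c("N1", "tumor", "tumor", "N2", "tumor", "tumor", "N3"))

# Cross-tabulate to see which treatment levels occur in which batch.
print(table(batch, treatment))

# Same formula as in the real analysis.
design <- model.matrix(~0 + batch + treatment)

# Fewer independent columns than coefficients means the design is not of full rank.
cat("columns:", ncol(design), " rank:", qr(design)$rank, "\n")

# The batch1157 and treatmentN3 columns both flag only the last sample,
# so the two coefficients cannot be told apart.
print(identical(design[, "batch1157"], design[, "treatmentN3"]))
```

In this toy case the design has six columns but rank five, and the batch column for 1157 is identical to the treatment column for its only sample. I assume the same thing happens with treatmentTCGA-CG-4476 in my real design.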
Is there a way around that?
Thanks,
Eugene
-- output of sessionInfo():
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] edgeR_3.4.2 limma_3.18.6
loaded via a namespace (and not attached):
[1] tools_3.0.2
--
Sent via the guest posting facility at bioconductor.org.