[BioC] ComBat:covariates or additional batch
Hedi Peterson [guest]
guest at bioconductor.org
Fri Jan 25 12:42:00 CET 2013
I have a question regarding batch and covariates in ComBat. I have an Illumina expression dataset of control versus treatment at 3 time points (day 3, 5 and 7), with 4 biological replicates each. However, RNA from the different experimental groups (replicates) was extracted on different dates, and by default the samples cluster by RNA extraction date, so I have a strong batch effect.
Should I use RNA extraction date as Batch, and both the biological replicate groups and Time_x_Treatment as covariates (as in the sample file shown below)? Or should I apply it some other way, for example using extraction date as the first batch and then re-running with replicate groups as a second batch?
Second question: is it correct to subgroup the covariate as Time_x_Treatment, or should I just have control vs treatment (and not specify the day factor)?
Array  Sample               Batch  Covariate      Covariate2
D3C4   day3control_repl4    1      day3control    1
D3C3   day3control_repl3    2      day3control    3
D3C2   day3control_repl2    3      day3control    4
D3C1   day3control_repl1    3      day3control    2
D5C4   day5control_repl4    2      day5control    3
D5C3   day5control_repl3    1      day5control    1
D5C2   day5control_repl2    1      day5control    4
D5C1   day5control_repl1    4      day5control    2
D7C4   day7control_repl4    5      day7control    4
D7C3   day7control_repl3    4      day7control    3
D7C2   day7control_repl2    2      day7control    2
D7C1   day7control_repl1    1      day7control    1
D3T4   day3treatment_repl4  1      day3treatment  1
D3T3   day3treatment_repl3  4      day3treatment  3
D3T2   day3treatment_repl2  1      day3treatment  4
D3T1   day3treatment_repl1  6      day3treatment  2
D5T4   day5treatment_repl4  5      day5treatment  3
D5T3   day5treatment_repl3  4      day5treatment  1
D5T2   day5treatment_repl2  6      day5treatment  4
D5T1   day5treatment_repl1  6      day5treatment  2
D7T4   day7treatment_repl4  2      day7treatment  4
D7T3   day7treatment_repl3  4      day7treatment  3
D7T2   day7treatment_repl2  5      day7treatment  2
D7T1   day7treatment_repl1  2      day7treatment  1
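For reference, the layout above could be checked in R roughly as follows. This is only a minimal sketch in base R: the object names (pheno, tab, mod, design) are made up for illustration, the Batch values are transcribed by hand from the sample file, and the commented-out ComBat call assumes the sva version of ComBat and a hypothetical expression matrix called expr. It cross-tabulates extraction batch against the Time_x_Treatment groups and checks that an additive batch + group design is not confounded.

```r
# Rebuild the sample sheet from the table above (24 arrays).
days <- rep(rep(c(3, 5, 7), each = 4), times = 2)
trt  <- rep(c("C", "T"), each = 12)
repl <- rep(4:1, times = 6)

pheno <- data.frame(
  Array = paste0("D", days, trt, repl),
  # RNA extraction batch, transcribed from the Batch column
  Batch = factor(c(1, 2, 3, 3, 2, 1, 1, 4, 5, 4, 2, 1,
                   1, 4, 1, 6, 5, 4, 6, 6, 2, 4, 5, 2)),
  # Time_x_Treatment group, as in the Covariate column
  Group = factor(paste0("day", days,
                        ifelse(trt == "C", "control", "treatment"))),
  stringsAsFactors = FALSE
)

# How are the extraction batches spread over the experimental groups?
tab <- table(Batch = pheno$Batch, Group = pheno$Group)
print(tab)

# Covariate matrix holding only the biological variable of interest.
mod <- model.matrix(~ Group, data = pheno)

# Batch and group are separable only if the additive design has full rank.
design <- model.matrix(~ Batch + Group, data = pheno)
full_rank <- qr(design)$rank == ncol(design)
print(full_rank)

# Assumption: ComBat from the sva package, with 'expr' a hypothetical
# probes x arrays matrix whose columns are ordered like pheno$Array.
# corrected <- sva::ComBat(dat = expr, batch = pheno$Batch, mod = mod)
```

The cross-tabulation makes it easy to see which batches contain only one group (here batch 3 holds only day3control arrays), which matters for how well the batch effect can be estimated separately from the group effect.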
-- output of sessionInfo():
R version 2.15.2 (2012-10-26)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] splines stats4 graphics grDevices utils datasets grid stats methods base
other attached packages:
[1] mgcv_1.7-22 corpcor_1.6.4 RColorBrewer_1.0-5 panp_1.28.1 hwriter_1.3 R2HTML_2.2 reshape_0.8.4 plyr_1.7.1 gProfileR_0.2 fpc_2.1-5
[11] flexmix_2.3-8 multcomp_1.2-14 survival_2.36-14 mvtnorm_0.9-9993 modeltools_0.2-19 lattice_0.20-10 mclust_4.0 MASS_7.3-22 cluster_1.14.3 preprocessCore_1.20.0
[21] affy_1.36.0 Biobase_2.18.0 BiocGenerics_0.4.0 limma_3.14.1 pheatmap_0.7.4 gridExtra_0.9.1 ggplot2_0.9.2.1
--
Sent via the guest posting facility at bioconductor.org.