[BioC] sequential removeBatchEffect()?

Thu Jun 30 15:25:24 CEST 2011

HI Gordon,

No, Sample_Group isn't confounded with BeadChip. Our microarray unit 
is very good about asking customers about the treatment groups and 
randomizing them on arrays. I use GenomeStudio to output summarized 
beadtype values (no bg or norm), and GenomeStudio always re-organizes 
the by Sample_Group. I've learned to double-check that the order in 
my targets file is the same as the order coming out of GenomeStudio!

Jenny

At 06:08 PM 6/29/2011, Gordon K Smyth wrote:
>Hi Jenny,
>
>Perhaps I'm mis-understanding your setup, but isn't Sample_Group 
>confounded with BeadChip?  If the samples are entered in BeadChip 
>order, then the first six would be BeadChip 1 (and also Cont.8), the 
>second six would be BeadChip 2 (and also DHT.8), etc.  Or have the 
>samples been reordered by Sample_Group?
>
>Best wishes
>Gordon
>
>---------------------------------------------
>Professor Gordon K Smyth,
>Bioinformatics Division,
>Walter and Eliza Hall Institute of Medical Research,
>1G Royal Parade, Parkville, Vic 3052, Australia.
>smyth at wehi.edu.au
>http://www.wehi.edu.au
>http://www.statsci.org/smyth
>
>On Wed, 29 Jun 2011, Jenny Drnevich wrote:
>
>>Hi Gordon and everyone,
>>
>>I was wondering what is the best way to remove two different batch 
>>effects from a set of sample values for plotting heatmaps? I have 
>>data from Illumina's MouseWG6 arrays, 24 samples on 4 BeadChips. 
>>There is the effect of BeadChip, which I'm accounting for in the 
>>model using duplicateCorrelation(), and also population batch 
>>effect, which I have as a term in the design matrix. I'd like to 
>>remove both of these effects from the individual sample data before 
>>making heatmaps of significant genes. Should I use 
>>removeBatchEffect() twice, first to remove the BeadChip effect 
>>(with population in the design) and then remove the population 
>>effect? Here's an example of my analysis code and what I'm thinking 
>>of doing to remove the two effect. Please let me know if this is 
>>alright, or if I should do something different:
>>
>>>design <- model.matrix(~0+ factor(targets$Sample_Group))
>>>colnames(design) <- levels(factor(targets$Sample_Group))
>>>design <- cbind(design,pop=as.numeric(targets$pop==1))
>>>design
>>   Cont.24 Cont.8 DHT.24 DHT.8 pop
>>1        0      1      0     0   1
>>2        0      1      0     0   0
>>3        0      1      0     0   0
>>4        0      1      0     0   1
>>5        0      1      0     0   1
>>6        0      1      0     0   0
>>7        0      0      0     1   1
>>8        0      0      0     1   0
>>9        0      0      0     1   0
>>10       0      0      0     1   1
>>11       0      0      0     1   0
>>12       0      0      0     1   1
>>13       1      0      0     0   1
>>14       1      0      0     0   0
>>15       1      0      0     0   1
>>16       1      0      0     0   0
>>17       1      0      0     0   0
>>18       1      0      0     0   1
>>19       0      0      1     0   1
>>20       0      0      1     0   0
>>21       0      0      1     0   1
>>22       0      0      1     0   0
>>23       0      0      1     0   0
>>24       0      0      1     0   1
>>
>>>duplicateCorrelation(BSlimma.neqc, design,
>>block=as.numeric(factor(targets$Sentrix_ID)))$consensus
>>[1]  0.1528686
>>
>>>fit.neqc <- lmFit(BSlimma.neqc, design,
>>block=as.numeric(factor(targets$Sentrix_ID)), correlation=0.1528686)
>>
>>>cont.matrix <- makeContrasts(DHT.8_v_Cont.8 = DHT.8 - Cont.8,
>>DHT.24_v_Cont.24 = DHT.24 - Cont.24,
>>+                             Cont.24_v_Cont.8 = Cont.24 - Cont.8, 
>>DHT.24_v_DHT.8 = DHT.24 - DHT.8,
>>+                             interact = (DHT.24 - Cont.24) - 
>>(DHT.8 - Cont.8),levels=design)
>>
>>>fit.neqc2 <- eBayes(contrasts.fit(fit.neqc,cont.matrix))
>>
>># This is what I _think_ I should do to remove both batch effects 
>>from the sample data for heatmaps:
>>
>>>noChip.values <- removeBatchEffect(BSlimm.neqc$E,
>>batch=as.numeric(factor(targets$Sentrix_ID)), design=design)
>>
>>>noChip.noPop.values <- removeBatchEffect(noChip.values, batch=targets$pop,
>>design = design[,1:4] )
>>
>>Then I can use noChip.noPop.values for heatmaps?
>>
>>Thanks,
>>Jenny
>>
>>>sessionInfo()
>>R version 2.13.0 (2011-04-13)
>>Platform: x86_64-pc-mingw32/x64 (64-bit)
>>
>>locale:
>>[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
>>States.1252    LC_MONETARY=English_United States.1252
>>[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252
>>
>>attached base packages:
>>[1] splines   grid      stats     graphics  grDevices 
>>datasets  utils methods   base
>>
>>other attached packages:
>>[1] statmod_1.4.9         beadarray_2.2.0       limma_3.8.2 
>>WGCNA_1.00 Hmisc_3.8-3
>>[6] survival_2.36-5       qvalue_1.26.0         flashClust_1.01 
>>dynamicTreeCut_1.21   impute_1.26.0
>>[11] made4_1.26.0          scatterplot3d_0.3-33  gplots_2.8.0 
>>caTools_1.11 bitops_1.0-4.1
>>[16] gdata_2.8.1           gtools_2.6.2          RColorBrewer_1.0-2 
>>ade4_1.4-17           affyPLM_1.28.5
>>[21] preprocessCore_1.14.0 
>>gcrma_2.24.1          affycoretools_1.24.0 KEGG.db_2.5.0 GO.db_2.5.0
>>[26] 
>>RSQLite_0.9-4         DBI_0.2-5             AnnotationDbi_1.14.1 
>>affy_1.30.0           Biobase_2.12.1
>>[31] RWinEdt_1.8-2
>>
>>loaded via a namespace (and not attached):
>>[1] 
>>affyio_1.20.0     annaffy_1.24.0    annotate_1.30.0   biomaRt_2.8.1 
>>Biostrings_2.20.1 Category_2.18.0   cluster_1.13.3
>>[8] genefilter_1.34.0 
>>GOstats_2.18.0    graph_1.30.0      GSEABase_1.14.0 
>>IRanges_1.10.4   lattice_0.19-23   RBGL_1.28.0
>>[15] 
>>RCurl_1.6-6.1     tcltk_2.13.0      tools_2.13.0      XML_3.4-0.2 xtable_1.5-6
>>
>>
>
>______________________________________________________________________
>The information in this email is confidential and inten...{{dropped:6}}