Hi,

I have a question concerning the number of differentially expressed
probes after combining batches with ComBat from 'sva'.

I have 2 data sets: one containing around 250 samples that correspond to
around 50 groups, another one containing 10 samples corresponding to 2
groups (let me call them Batch2_Group1, Batch2_Group2). One of the 2 group
labels in the second batch (Batch2_Group2) also exists in the first batch,
so batch and group are not completely confounded.

Before batch correction the 2 data sets cluster by batch, not by group.

I used ComBat from the R/Bioconductor package 'sva' to correct for this,
passing a model matrix of the group labels to account for the group that
is shared between the 2 batches, and setting par.prior=TRUE, i.e. using the
parametric adjustment. After the batch correction the samples cluster
by group and no longer by batch.
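
To make the setup concrete, here is a minimal sketch of the kind of call I
mean. It runs on simulated data, not my real data: the object names
(expr, batch, group), the group labels A to D, and all sizes are
placeholders chosen for illustration only.

```r
library(sva)

set.seed(1)
n_probes <- 500

# Toy layout (assumed): batch 1 holds groups A, B, C; batch 2 holds C and D.
# Group C is shared, so batch and group are not completely confounded.
batch <- factor(c(rep("batch1", 15), rep("batch2", 10)))
group <- factor(c(rep(c("A", "B", "C"), each = 5),
                  rep(c("C", "D"), each = 5)))

# Simulated log-expression with a probe-specific additive shift in batch 2.
expr <- matrix(rnorm(n_probes * length(batch)), nrow = n_probes)
shift <- rnorm(n_probes, mean = 2, sd = 0.5)
expr[, batch == "batch2"] <- expr[, batch == "batch2"] + shift
rownames(expr) <- paste0("probe", seq_len(n_probes))
colnames(expr) <- paste0("sample", seq_along(batch))

# Model matrix of the group labels that the adjustment should preserve.
mod <- model.matrix(~ group)

# Parametric empirical Bayes adjustment (par.prior = TRUE).
adjusted <- ComBat(dat = expr, batch = batch, mod = mod, par.prior = TRUE)
```

The result, adjusted, has the same dimensions as expr and is what I then
use for clustering and for the differential analysis.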

I do notice, however, that the number of differentially expressed probes
between Batch2_Group1 and Batch2_Group2 changes dramatically once the data
are combined. Within Batch2 alone I have around 1000 differentially
expressed probes, about half up-regulated and half down-regulated. After
combining the data I have around 3000 differentially expressed probes for
the same group comparison, ~2000 up and ~1000 down. (I use 'limma' for the
differential analysis.)
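
The two counts are obtained roughly as in the sketch below. Again this is
simulated data with placeholder names: combined stands in for the
batch-adjusted matrix, count_de is a hypothetical helper written for this
example, C and D stand in for the two Batch2 groups, and the BH-adjusted
p < 0.05 cutoff is an assumption made for the example.

```r
library(limma)

set.seed(2)
n_probes <- 500

# Same toy layout as assumed above: group C is shared, D is only in batch 2.
batch <- factor(c(rep("batch1", 15), rep("batch2", 10)))
group <- factor(c(rep(c("A", "B", "C"), each = 5),
                  rep(c("C", "D"), each = 5)))

# Stand-in for the batch-adjusted matrix; 50 probes are shifted up in D.
combined <- matrix(rnorm(n_probes * length(group)), nrow = n_probes)
rownames(combined) <- paste0("probe", seq_len(n_probes))
combined[1:50, group == "D"] <- combined[1:50, group == "D"] + 3

# Hypothetical helper: count up/down probes for the D vs C contrast.
count_de <- function(expr, group, cutoff = 0.05) {
  group <- droplevels(group)
  design <- model.matrix(~ 0 + group)
  colnames(design) <- levels(group)
  fit <- lmFit(expr, design)
  contrast <- makeContrasts(D - C, levels = design)
  fit <- eBayes(contrasts.fit(fit, contrast))
  tab <- topTable(fit, coef = 1, number = Inf)
  sig <- tab$adj.P.Val < cutoff
  c(up = sum(sig & tab$logFC > 0), down = sum(sig & tab$logFC < 0))
}

# Analysis 1: batch 2 samples only.
in_b2 <- batch == "batch2"
de_batch2 <- count_de(combined[, in_b2], group[in_b2])

# Analysis 2: all samples, same contrast taken from the full group design.
de_all <- count_de(combined, group)
```

Note that in the second analysis the design contains all groups, so the
residual variance is estimated from every sample, not only from the 10
samples in Batch2.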

It seems that ComBat pulled the groups Batch2_Group1 and Batch2_Group2
further apart from each other. The group that did not have a group label
match in Batch1 is now much more up-regulated.

Is there a way to adjust the batch combination so that the number of
differentially expressed probes stays similar to what it was within Batch2
alone?

Thank you,
Michaela
