[BioC] Removing batch effects

Tue Aug 23 11:39:16 CEST 2005

Hello,

If the batch effect is strong, I think normalization does more harm than good. You could first simply look at the extent of the batch effect by plotting un-normalized intensity distributions for each batch. If you see a visible difference between batches, normalisation will probably not do the right thing.
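
A minimal sketch of that first look, using base R only. The intensity matrix here is simulated and the two-fold "day2" shift is an assumption made purely for illustration; with real chips you would substitute the raw probe intensities (the affy package also provides boxplot() and hist() methods for AffyBatch objects).

```r
# Simulated stand-in for raw (un-normalized) probe intensities:
# 500 probes on 6 chips, 3 chips per processing day. All values are made up.
set.seed(1)
batch <- factor(rep(c("day1", "day2"), each = 3))
raw <- matrix(2^rnorm(500 * 6, mean = 8, sd = 1.5), nrow = 500,
              dimnames = list(NULL, paste0("chip", 1:6)))

# Hypothetical batch effect: day2 chips are roughly twice as bright.
raw[, batch == "day2"] <- 2 * raw[, batch == "day2"]

log_raw <- log2(raw)

# One box per chip, coloured by batch; a batch effect shows up as a
# block of boxes shifted relative to the others.
boxplot(log_raw, col = c("grey80", "steelblue")[as.integer(batch)],
        las = 2, ylab = "log2 raw intensity")
legend("topleft", legend = levels(batch),
       fill = c("grey80", "steelblue"), bty = "n")

# Numeric summary of the same picture: median of the chip medians per batch.
chip_medians <- apply(log_raw, 2, median)
batch_medians <- tapply(chip_medians, batch, median)
print(batch_medians)
```

If the boxes (or the batch medians) separate cleanly by processing day before any normalization, the batch effect is large enough to be worth modelling explicitly.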

I suggest you use a linear model (possibly via limma) and include batch as a factor. E.g. if you have a treatment factor you're interested in, you'd fit

lm(intensity ~ treatment + batch)
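
As a hedged illustration of the model above, here is a single-gene sketch on simulated data. The design (3 replicates per treatment on each of 3 days), the effect sizes and the noise level are all assumptions chosen for the example, not values from a real experiment.

```r
# Simulated data for one gene: 3 replicates x 2 treatments x 3 days.
set.seed(2)
d <- expand.grid(rep = 1:3,
                 treatment = c("control", "treated"),
                 batch = c("day1", "day2", "day3"))
d$treatment <- factor(d$treatment, levels = c("control", "treated"))
d$batch <- factor(d$batch)

# Assumed truth: treatment adds 1 (log2 scale); days shift by 0, 1.5 and -1.
batch_shift <- c(day1 = 0, day2 = 1.5, day3 = -1)
d$intensity <- 8 + 1 * (d$treatment == "treated") +
  batch_shift[as.character(d$batch)] + rnorm(nrow(d), sd = 0.3)

# Batch enters as a fixed factor, so the treatment coefficient is
# estimated within days rather than across them.
fit <- lm(intensity ~ treatment + batch, data = d)
trt <- coef(summary(fit))["treatmenttreated", ]
print(trt)
```

With limma the same design would be passed as model.matrix(~ treatment + batch) to lmFit(), which fits this model for every gene at once.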

The batch effect could also be treated as a random effect. For that you'd use lme from the nlme package (I think limma offers similar functionality):

library(nlme)
lme(intensity ~ treatment, random = ~ 1 | batch)
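
A self-contained sketch of the random-effect variant, again on simulated single-gene data. The six processing days, the effect size and the variances are assumptions for illustration; with only two or three batches the between-batch variance is estimated very poorly.

```r
library(nlme)

# Simulated data for one gene: 3 replicates x 2 treatments x 6 days.
set.seed(3)
d <- expand.grid(rep = 1:3,
                 treatment = c("control", "treated"),
                 batch = paste0("day", 1:6))
d$treatment <- factor(d$treatment, levels = c("control", "treated"))
d$batch <- factor(d$batch)

# Assumed truth: treatment adds 1; each day gets a random shift (sd = 1).
batch_shift <- rnorm(nlevels(d$batch), sd = 1)
names(batch_shift) <- levels(d$batch)
d$intensity <- 8 + 1 * (d$treatment == "treated") +
  batch_shift[as.character(d$batch)] + rnorm(nrow(d), sd = 0.3)

# Random intercept per batch; treatment stays a fixed effect.
fit_re <- lme(intensity ~ treatment, random = ~ 1 | batch, data = d)
print(fixef(fit_re))    # fixed-effect estimates
print(VarCorr(fit_re))  # between-batch and residual variance
```

In limma, the closest counterpart is duplicateCorrelation() with batch passed as the block argument.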

You should compare the results (number of genes with a significant treatment effect) from the linear model when intensities are normalized all together (all batches) and when batches are normalized separately. I had a similar problem some time ago, and I found that the batch factor in the lm takes care of the batch effect, so that batches could be normalized separately.
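
One way to set up that comparison, sketched on simulated data. The quantile_normalize() and count_sig() helpers are hypothetical names written for this example, the simple quantile normalization is only a stand-in for RMA, vsn and the like, and the BH adjustment is an added assumption about how "significant" is defined.

```r
set.seed(4)
n_genes <- 100
treatment <- factor(rep(rep(c("control", "treated"), each = 3), times = 2),
                    levels = c("control", "treated"))
batch <- factor(rep(c("day1", "day2"), each = 6))

# Simulated log2 expression: genes 1-10 respond to treatment (+2),
# and every gene is shifted by +1.5 on day2.
baseline <- rnorm(n_genes, mean = 8, sd = 1.5)
x <- matrix(baseline, n_genes, 12) +
  matrix(rnorm(n_genes * 12, sd = 0.3), n_genes, 12)
x[1:10, treatment == "treated"] <- x[1:10, treatment == "treated"] + 2
x[, batch == "day2"] <- x[, batch == "day2"] + 1.5

# Plain quantile normalization: every chip gets the same distribution.
quantile_normalize <- function(m) {
  target <- rowMeans(apply(m, 2, sort))
  apply(m, 2, function(col) target[rank(col, ties.method = "first")])
}

# Number of genes with a BH-adjusted significant treatment effect
# in the per-gene model y ~ treatment + batch.
count_sig <- function(expr, treatment, batch, alpha = 0.05) {
  p <- apply(expr, 1, function(y) {
    coef(summary(lm(y ~ treatment + batch)))["treatmenttreated", "Pr(>|t|)"]
  })
  sum(p.adjust(p, method = "BH") < alpha)
}

expr_together <- quantile_normalize(x)
expr_separate <- x
for (b in levels(batch)) {
  expr_separate[, batch == b] <- quantile_normalize(x[, batch == b])
}

n_together <- count_sig(expr_together, treatment, batch)
n_separate <- count_sig(expr_separate, treatment, batch)
print(c(together = n_together, separate = n_separate))
```

If the two counts (and the gene lists behind them) agree closely, the batch term is absorbing the batch effect and the choice of normalization scheme matters little.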

You should also evaluate the proportion of genes with significant interactions between treatment and batch (lm(intensity ~ treatment + batch + treatment:batch)). In my analysis the number of genes with significant interactions was less than 2% of the number of genes with significant main effects, so the interaction could "safely" be ignored (3 genes). However, if a large number of genes show significant interactions (i.e. the treatment effect behaves differently between days), the analysis becomes difficult.
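
A sketch of that check on simulated data: for each gene the additive model is compared with the interaction model by an F-test, and the count of interaction genes is set against the count of main-effect genes. The design, the effect sizes and the BH adjustment are assumptions made for this example.

```r
set.seed(5)
n_genes <- 100
design <- expand.grid(rep = 1:3,
                      treatment = c("control", "treated"),
                      batch = c("day1", "day2", "day3"))
treatment <- factor(design$treatment, levels = c("control", "treated"))
batch <- factor(design$batch)

# Simulated log2 expression, 18 chips. Genes 1-20 have a treatment
# main effect (+1); genes 1-3 respond more strongly on day3 (+3 extra).
expr <- matrix(rnorm(n_genes * 18, mean = 8, sd = 0.3), n_genes, 18)
expr[1:20, treatment == "treated"] <- expr[1:20, treatment == "treated"] + 1
sel <- treatment == "treated" & batch == "day3"
expr[1:3, sel] <- expr[1:3, sel] + 3

# Per gene: p-value of the treatment main effect (additive model) and
# of the treatment:batch interaction (F-test between nested models).
p <- t(apply(expr, 1, function(y) {
  fit_add <- lm(y ~ treatment + batch)
  fit_int <- lm(y ~ treatment + batch + treatment:batch)
  c(main = coef(summary(fit_add))["treatmenttreated", "Pr(>|t|)"],
    interaction = anova(fit_add, fit_int)[2, "Pr(>F)"])
}))

n_main <- sum(p.adjust(p[, "main"], method = "BH") < 0.05)
n_int <- sum(p.adjust(p[, "interaction"], method = "BH") < 0.05)
print(c(main = n_main, interaction = n_int, ratio = n_int / n_main))
```

A small ratio suggests the additive model is adequate; a large one means the treatment effect has to be interpreted day by day.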

	hope this helps,
	kind regards,

	Arne

> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch]On Behalf Of Aedin
> Sent: Monday, August 22, 2005 22:47
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] Removing batch effects
> 
> 
> Dear BioC
> 
> I have a set of Affymetrix chips which have a clear "process day" batch
> effect. This effect is only partially removed by RMA, gcRMA, vsn or
> invariantset "expresso" normalisation.
> 
> What would you recommend:
> 
> 1. Normalize each small batch, merge the batches and re-normalize. If so,
> is one of the above normalisation options better than the others?
> 
> 2. Model the batch effect using fitPLM. If so, how do I extract the
> "corrected" exprs?
> 
> 3. Any other suggestions? 
> 
> 
> Thanks a million for your help in this,
> 
> Aedin  Culhane