[BioC] batch effects and VSN

Hans-Ulrich Klein h.klein at uni-muenster.de
Wed Sep 26 17:43:35 CEST 2007


Hello List,

I am analyzing some arrays with strong "batch effects". The source of 
the variation is not fully known, but the biologists and I found that 
some of the systematic variation is related to processing steps in the 
laboratory (ChIP experiment).

My first general question is: how do you deal with batch effects? I did 
not find much about it in the archive.

I proceeded as follows:

I used limma to identify oligos with differential intensities between 
two classes. Adding a factor for batch effects is easy and reduces the 
R^2 of the gene-wise models in my case noticeably.
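
To make the idea concrete, here is a rough sketch of a gene-wise linear 
model with and without a batch term. It is not limma itself, only 
ordinary least squares in NumPy on simulated data; the design (two 
classes, three batches), the effect sizes and all variable names are 
assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_probes, n_arrays = 200, 12
# Hypothetical design: two classes, three batches, balanced within each batch.
cls = np.array([0, 1] * 6)
batch = np.repeat([0, 1, 2], 4)

# Simulated normalized intensities: per-probe batch offsets plus noise;
# the first 20 probes carry a true class effect of 1.0.
batch_offset = rng.normal(0.0, 1.0, size=(n_probes, 3))
effect = np.zeros(n_probes)
effect[:20] = 1.0
Y = (batch_offset[:, batch]
     + effect[:, None] * cls[None, :]
     + rng.normal(0.0, 0.3, size=(n_probes, n_arrays)))


def fit(X, Y):
    """Least-squares fit of every probe (rows of Y) on the design matrix X."""
    coef, *_ = np.linalg.lstsq(X, Y.T, rcond=None)
    resid = Y.T - X @ coef
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    sigma2 = (resid ** 2).sum(axis=0) / dof  # residual variance per probe
    return coef, sigma2


intercept = np.ones(n_arrays)
X_plain = np.column_stack([intercept, cls]).astype(float)
# Treatment coding for the batch factor: batch 0 is the reference level.
X_batch = np.column_stack([intercept, cls, batch == 1, batch == 2]).astype(float)

coef_plain, s2_plain = fit(X_plain, Y)
coef_batch, s2_batch = fit(X_batch, Y)

print("median residual variance without batch term:", np.median(s2_plain))
print("median residual variance with batch term:   ", np.median(s2_batch))
print("mean class effect, first 20 probes:", coef_batch[1, :20].mean())
```

In this simulation the batch term absorbs the per-probe batch offsets, 
so the residual variance drops while the class coefficient is still 
estimated. Real data need not behave as cleanly.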


I am more worried about the normalization. I like VSN and used it here, 
too. The arrays are single color oligonucleotide arrays (not 
commercial). The VSN vignette states that VSN is not capable of 
calibrating arrays from different batches.

Using the notation of the vignette, the vsn model is:

y_ki = a_ki + b_i b_k c_ki

y_ki is the measured intensity of gene k on array i. c_ki is the true 
mRNA abundance. The oligo-specific factor b_k is not estimated. Instead 
the normalized intensities are given in probe-specific units. However, 
b_k will perhaps be different for different batches. Could one 
substitute b_k by b_kb, which is an oligo-specific factor for oligo k in 
batch b? b_kb would have to be estimated from the data.
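
A small numerical sketch of what such a factor b_kb would do on the 
transformed scale. It assumes an arsinh (glog-type) transform, array 
parameters already calibrated to a_i = 0 and b_i = 1, no noise, and 
intensities well above the background; all values are simulated and the 
variable names are mine. Under these assumptions, a multiplicative 
probe-and-batch factor shows up as an additive per-probe shift of about 
log(b_k2) - log(b_k1) between the two batches.

```python
import numpy as np

rng = np.random.default_rng(1)

n_probes, n_batches = 500, 2
# True abundances c_k, identical in both batches (no biological change).
c = rng.uniform(200.0, 2000.0, size=n_probes)
# Hypothetical probe-and-batch-specific factors b_kb.
b_kb = rng.uniform(0.5, 2.0, size=(n_probes, n_batches))
# Noise-free intensities with a_i = 0 and b_i = 1 (assumed calibrated).
y = b_kb * c[:, None]

h = np.arcsinh(y)  # glog-type transform; close to log(2 * y) for large y

observed_shift = h[:, 1] - h[:, 0]
predicted_shift = np.log(b_kb[:, 1]) - np.log(b_kb[:, 0])
max_abs_diff = np.abs(observed_shift - predicted_shift).max()
print("max |observed - predicted| shift:", max_abs_diff)
```

For low intensities the approximation arsinh(y) ~ log(2y) breaks down, so 
the shift would no longer be a pure per-probe constant there.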

I am not sure whether this is practical. The number of model parameters 
increases a lot, since there would be one factor per oligo and batch. 
So I wonder whether someone has tried this (or something similar) before?

Any comments are welcome, as are hints about other normalization 
procedures that may be suitable. Currently, I am using (standard) VSN. 
It seems to work (stable variance, iteration converges), but the batch 
effects remain. Also, the LTS regression probably selects probes for 
estimation that have small batch effects (and not necessarily an equal 
amount of hybridized DNA between my two classes of interest).

Regards,
Hans-Ulrich

-- 
Hans-Ulrich Klein
Westfälische Wilhelms-Universität Münster
Department of Medical Informatics and Biomathematics
Domagkstr. 9, 48149 Münster


