[BioC] continued dye effects, after normalization

Wed Jan 10 17:42:23 CET 2007

Hi all,

I've been analyzing a spotted array experiment that used a common reference 
with a 2X2 factorial design. There were no technical dye swaps, but half of 
the 6 replicates in each group had the ref in Cy3 and half had the ref in 
Cy5. Now that Jim has modified plotPCA to accept matrices, I was checking 
for any unsuspected groupings that might indicate block effects. To my 
surprise, the arrays were still grouping based on the reference channel, 
even after inverting the M-values so that the reference channel was always 
in the denominator! Attached is a figure with 2 PCA plots (hopefully small 
enough to make it through the list); the code that created them is 
below. Has anyone else noticed this, and what have you done about it? I 
went back and checked some other experiments that used a common reference, 
and they also mostly showed a continued dye grouping. A between-array scale 
normalization, either on the regular M-values or on inverted M-values, 
failed to remove the dye effect as well. I didn't try other normalizations, 
but instead included 'ref dye' as a blocking variable. The consensus 
correlation from duplicateCorrelation was 0.154; including it in the 
lmFit model increased the number of genes found to be significantly different.
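
One way to picture why the grouping can survive the inversion, as a minimal
sketch in base R on simulated data (not the arrays described here): assume
each gene carries its own additive Cy5-vs-Cy3 bias. That bias is an
assumption made purely for illustration, and the layout, the names (orient,
dye, M_flipped) and the effect sizes are all hypothetical. Because
M = log2(Cy5/Cy3), inverting an array flips the sign of the bias along with
the signal, so the two ref-dye groups end up with biases of opposite sign.

```r
set.seed(1)
n_genes <- 500

# Hypothetical layout: 2 treatments x 2 reference dyes x 2 replicates
trt     <- rep(c("A", "B"), each = 4)
ref_dye <- rep(c("Cy3", "Cy5"), times = 4)

base  <- rnorm(n_genes)            # log2(sample/ref) shared by both treatments
delta <- rnorm(n_genes)            # extra log2 change in treatment B
dye   <- rnorm(n_genes, sd = 0.3)  # assumed gene-specific Cy5-vs-Cy3 bias

# True log2(sample/ref), genes x arrays
true_lr <- sapply(trt, function(g) base + (g == "B") * delta)
orient  <- ifelse(ref_dye == "Cy3", 1, -1)  # +1 when M is log2(sample/ref)
noise   <- matrix(rnorm(n_genes * 8, sd = 0.1), n_genes, 8)

# Observed M = log2(Cy5/Cy3): the signal changes sign with the orientation,
# the dye bias does not
M <- sweep(true_lr, 2, orient, "*") + dye + noise

# Invert the arrays with the ref in Cy5 so every column is log2(sample/ref);
# the dye bias now has opposite sign in the two ref-dye groups
M_flipped <- sweep(M, 2, orient, "*")

pc_raw     <- prcomp(t(M))
pc_flipped <- prcomp(t(M_flipped))

# How strongly each component tracks ref dye or treatment
cor_raw_pc1     <- abs(cor(pc_raw$x[, 1], orient))
cor_flipped_pc1 <- abs(cor(pc_flipped$x[, 1], as.numeric(trt == "B")))
cor_flipped_pc2 <- abs(cor(pc_flipped$x[, 2], orient))
```

In this toy setup PC1 of the uninverted data follows the reference channel,
while after inversion PC1 follows treatment and PC2 follows the reference
channel, which is similar to the pattern in the attached plots.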

I have been working with a physics professor and his student who have 
developed a different data mining algorithm, which shows these dye effects 
even more strongly than PCA. They are suggesting another normalization is 
needed to remove the ref dye effect, and they want to normalize the ref dye 
groups separately. Doing a separate normalization doesn't seem like a good 
idea to me, and I wanted to get other opinions on the dye effect, my 
approach, and other normalization options.

Thanks!
Jenny

code:

RG <- read.maimages(targetsb$FileName,path="D:/MA Jenny",
                 source="genepix.median",names=targetsb$Label,wt.fun=f)

RG.half <- backgroundCorrect(RG,method="half")

MA.half <- normalizeWithinArrays(RG.half)

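temp <- MA.half
# M = log2(Cy5/Cy3), so flip the arrays with the ref in Cy5 to put the ref
# in the denominator everywhere, i.e. M = log2(sample/ref)
temp$M[,targetsb$Cy5=="ref"] <- -1 * temp$M[,targetsb$Cy5=="ref"]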

layout(matrix(1:2,2,1))
plotPCA(MA.half$M,groups=rep(c(1,2,1,2,1,2,1,2),each=3),
        groupnames=c("ref G","ref R"))
         # PC1 divides the arrays by which channel the ref was in
plotPCA(temp$M,groups=rep(c(1,2,1,2,1,2,1,2),each=3),
        groupnames=c("ref G","ref R"))
         # after inverting the M-values for half the arrays, PC1 divides
         # the arrays by one of the treatments, but
         # the dye effect still shows up in PC2

MA.half.scale <- normalizeBetweenArrays(MA.half,method="scale")

design <- modelMatrix(targetsb,ref="ref")

block <- rep(c(1,2,1,2,1,2,1,2),each=3)

corfit <- duplicateCorrelation(MA.half.scale[RG$genes$Status=="cDNA",], 
design, ndups=1, block=block)

corfit$consensus
     #[1] 0.1537080
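
The lmFit step described in the message is not shown in the code. A sketch
of how the blocking variable and consensus correlation would be passed in,
continuing from the objects defined in the code so far (the names keep and
fit are hypothetical, and the actual call may have differed):

```r
library(limma)

# Continues from MA.half.scale, design, block and corfit defined earlier
keep <- RG$genes$Status=="cDNA"   # same cDNA spots used for duplicateCorrelation

# Treat ref dye as a block, with the consensus within-block correlation
fit <- lmFit(MA.half.scale[keep,], design, block=block,
             correlation=corfit$consensus)
fit <- eBayes(fit)
```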

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu