[BioC] Analysis tips for two-color oligo arrays with common reference and dye swap

Thu Aug 14 09:34:22 CEST 2008

Dear all,
I would appreciate some feedback on an analysis I have done on data
from a microarray experiment that used two-color oligo arrays with a
common reference and dye swaps. As an example, consider 4 individuals
in two different physiological states (2 in each). I would like to
find the genes that are differentially expressed between the 2 states.
The targets file is as follows:
Name		Cy3		Cy5
Array1		State1	Ref
Array2		Ref		State1
Array3		State1	Ref
Array4		Ref		State1
Array5		State2	Ref
Array6		Ref		State2
Array7		State2	Ref
Array8		Ref		State2

Array1 and Array2 are dye-swapped technical replicates of the same
individual, as is each of the other consecutive pairs of slides.

The design matrix would be:
       State1 State2
Array1     -1      0
Array2      1      0
Array3     -1      0
Array4      1      0
Array5      0     -1
Array6      0      1
Array7      0     -1
Array8      0      1
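
As a cross-check, the design matrix above can be rebuilt from the targets table in base R. This is only a sketch: the `targets` data frame is typed in by hand from the table in this post, and the coding used (+1 when the state is in Cy5, -1 when it is in Cy3, with log-ratios taken as Cy5 over Cy3) is an assumption. limma's `modelMatrix(targets, ref = "Ref")` should give the same matrix.

```r
# Targets table from the post, entered by hand
targets <- data.frame(
  Name = paste0("Array", 1:8),
  Cy3  = c("State1", "Ref", "State1", "Ref", "State2", "Ref", "State2", "Ref"),
  Cy5  = c("Ref", "State1", "Ref", "State1", "Ref", "State2", "Ref", "State2"),
  stringsAsFactors = FALSE
)

# For each state: +1 if it is in Cy5, -1 if it is in Cy3, 0 if absent
states <- c("State1", "State2")
design <- sapply(states, function(s) {
  (targets$Cy5 == s) - (targets$Cy3 == s)
})
rownames(design) <- targets$Name

print(design)
```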

Background correction was done using the 'normexp+offset' method.
Because the diagnostic plots showed spatial variation and
intensity-dependent trends, within-array normalization was done using
the print-tip loess method, followed by between-array normalization
using the quantile method. Spots flagged -75 or below were given zero
weight.
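
The spot weighting rule just described can be written as a small weight function. This is a minimal sketch, not the code actually used: the name `flag_weights` is hypothetical, and it assumes the image-analysis output has a numeric column called `Flags`. A function of this shape is what the `wt.fun` argument of limma's `read.maimages` expects.

```r
# Hypothetical weight function: 0 for spots flagged -75 or below, 1 otherwise.
# Assumes the spot table has a numeric column named "Flags".
flag_weights <- function(x) as.numeric(x$Flags > -75)

# Toy spot table to show the behaviour at and around the cutoff
spots <- data.frame(Flags = c(0, -50, -75, -100))
w <- flag_weights(spots)
print(w)
```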
Further analysis was done as mentioned in section 8.2 'Technical
Replication' and section 8.4 'Two Groups: Common Reference' of LIMMA
user's guide (Smyth et al., 2007):

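library(limma)
# One block per individual; each block is a dye-swapped pair of arrays
biolrep <- c(1, 1, 2, 2, 3, 3, 4, 4)
# Estimate the correlation between technical replicates within a block
corfit <- duplicateCorrelation(MA, design, ndups = 1, block = biolrep)
fit <- lmFit(MA, design, block = biolrep, correlation = corfit$consensus,
             weights = MA$weights)
# Contrast of interest: State1 versus State2
cont.matrix <- makeContrasts(State1vsState2 = State1 - State2, levels = design)
fit <- contrasts.fit(fit, cont.matrix)
fit <- eBayes(fit)
topTable(fit, number = 30, sort.by = "M", adjust.method = "BH")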
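
Before trusting the blocking, it may help to confirm that each block really holds one dye-swapped pair from a single state. The check below is a sketch in base R; `cy3` and `cy5` are typed in by hand from the targets table, and `dye_sign` and `state` are helper names introduced here for illustration.

```r
# Channel assignments from the targets table, entered by hand
cy3 <- c("State1", "Ref", "State1", "Ref", "State2", "Ref", "State2", "Ref")
cy5 <- c("Ref", "State1", "Ref", "State1", "Ref", "State2", "Ref", "State2")
biolrep <- c(1, 1, 2, 2, 3, 3, 4, 4)

# Orientation of the non-reference sample: +1 if in Cy5, -1 if in Cy3
dye_sign <- ifelse(cy5 == "Ref", -1, 1)
# The non-reference sample hybridized on each array
state <- ifelse(cy5 == "Ref", cy3, cy5)

# Each block should contain both orientations (signs sum to zero) ...
balanced <- tapply(dye_sign, biolrep, sum) == 0
# ... and arrays from one state only
one_state <- tapply(state, biolrep, function(s) length(unique(s)) == 1)

print(all(balanced) && all(one_state))
```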

I found the corfit$consensus value to be 0.09, not the negative value
I expected for dye swaps. From an earlier post on this mailing list, I
understand that this value would be problematic only if it were much
larger than 0.
I also read a post from yesterday in which Gordon replied that LIMMA
isn't smart enough to handle the dye swaps and the blocking at the
same time. So I would like to know whether my analysis with the above
code is reliable and complete.
Looking forward to some tips on the analysis steps.
Thanks
Arun
PhD student, Wageningen University
The Netherlands