[BioC] Is GCRMA influenced by lexicographical order?

Mon Nov 8 10:19:35 CET 2010

Dear all,

I've made some experiment about GCRMA using bg.adjust.gcrma() contained 
in the gcrma package.

I take two CEL files, A.cel and B.cel (CASE I). Both are from arrays of 
the same type, for instance Mouse Gene 430 2.0 Arrays. In addition, I 
rename both files in a way, which changes the lexicographical order, 
say, I rename A.cel to BA.cel and I rename B.cel to AB.cel (CASE II).

Using bg.adjust.gcrma() and exprs() to obtain the expression values, for 
some probe sets I get different values for A.cel (in CASE I) than for 
BA.cel (in CASE II), although both files contain the same data. Of 
course, for B.cel and AB.cel the same phenomenon becomes obvious.

How can this be possible? To me, it seems as if GCRMA normalizes the 
data with respect to the lexicographical order, but as far as I know, 
GCRMA treats each array independently.

This effect is not reproducible using call.exprs() with argument 
algorithm="rma" or "mas5". But is reproducible for other arrays like 
HGU95A, for example.

Please, can anybody try to explain me, if this effect is a bug or a 
feature (and why)?

Best wishes,
Markus

#################################################################

sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:
[1] hgu133plus2probe_2.7.0 AnnotationDbi_1.12.0   hgu133plus2cdf_2.7.0 
[4] simpleaffy_2.26.0      genefilter_1.32.0      gcrma_2.22.0         
[7] affy_1.28.0            Biobase_2.10.0       

loaded via a namespace (and not attached):
 [1] Biostrings_2.18.0     DBI_0.2-5             IRanges_1.8.2       
 [4] RSQLite_0.9-2         affyio_1.18.0         annotate_1.28.0     
 [7] preprocessCore_1.12.0 splines_2.12.0        survival_2.35-8     
[10] tools_2.12.0          xtable_1.5-6        

Also tried on other machine with similar packages

sessionInfo()
R version 2.11.1 Patched (2010-09-16 r52943)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=de_DE.UTF-8
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] simpleaffy_2.24.0 genefilter_1.30.0 gcrma_2.20.0      affy_1.26.1
[5] Biobase_2.8.0

loaded via a namespace (and not attached):
 [1] affyio_1.16.0         annotate_1.26.1       AnnotationDbi_1.10.2
 [4] Biostrings_2.16.9     DBI_0.2-5             IRanges_1.6.17
 [7] preprocessCore_1.10.0 RSQLite_0.9-2         splines_2.11.1
[10] survival_2.35-8       tools_2.11.1          xtable_1.5-6

#################################################################

Code example

setwd("PATH TO CASE I DATA") #the directory containing A.cel and B.cel
CEL <- ReadAffy()
bag.1 <- bg.adjust.gcrma(CEL)
ebag.1 <- exprs(bag.1)

setwd("PATH TO CASE II DATA") #the directory containing BA.cel and AB.cel
CEL <- ReadAffy()
bag.2 <- bg.adjust.gcrma(CEL)
ebag.2 <- exprs(bag.2)

par(mfrow=c(2,1))
# differences should be zero
plot(ebag.1[,1]-ebag.2[,2])
plot(ebag.1[,2]-ebag.2[,1])