[BioC] preprocessing for tiling arrays (preprocessCore package)

Ann Hess hess at stat.colostate.edu
Mon Jul 21 19:13:33 CEST 2008


I would like to perform RMA style preprocessing for tiling arrays.  Based 
on previous discussion (thanks Ben and Jim!), it seems that the functions 
rma.background.correct() and normalize.quantiles() from the 
preprocessCore package are the easiest way to accomplish this.

First of all, I am having trouble getting the functions to work.  The 
rma.background.correct() function seems to be returning the original 
values.  The normalize.quantiles() function does something to the data, 
but the medians are not equal across arrays (which I would have expected 
from quantile normalization).
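
To make the expectation about medians concrete, here is a minimal base-R 
sketch of quantile normalization on simulated data. It is not the 
preprocessCore implementation; the function name 
quantile_normalize_sketch() and the toy matrix are hypothetical, and the 
tie-breaking rule is an assumption of the sketch.

```r
# Minimal base-R sketch of quantile normalization.
# NOT the preprocessCore code; names and tie handling are assumptions.
quantile_normalize_sketch <- function(x) {
  # Sort each column, then average across columns rank by rank to get
  # the shared reference distribution.
  sorted <- apply(x, 2, sort)
  reference <- rowMeans(sorted)
  # Replace each value by the reference value at its within-column rank.
  # ties.method = "first" breaks ties arbitrarily (assumption).
  ranks <- apply(x, 2, rank, ties.method = "first")
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Simulated intensities on three "arrays" with very different scales.
set.seed(1)
toy <- cbind(a = rexp(1000, 1 / 100),
             b = rexp(1000, 1 / 50),
             c = rexp(1000, 1 / 300))
toy_norm <- quantile_normalize_sketch(toy)

# Every column now holds the same set of values, so the medians agree.
apply(toy_norm, 2, median)
```

With continuous, tie-free data every column ends up with an identical 
distribution, so the medians match exactly. When a column contains tied 
values, as integer-valued intensities do, the result depends on how ties 
are handled and the medians may agree only approximately. This sketch 
does not try to reproduce the tie handling of normalize.quantiles().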

Secondly, I wanted to verify that, to most closely match the RMA algorithm, 
I should (1) restrict to PM values, (2) background correct, (3) normalize, 
then (4) log2 transform. Is this correct?
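
The four steps in that order can be sketched as follows. This is only an 
illustration on a simulated matrix: pm_toy is a hypothetical stand-in for 
the output of pm(), and the preprocessCore calls are guarded so the 
snippet still runs where the package is not installed.

```r
# Sketch of the proposed order of operations on simulated data.
# pm_toy is a hypothetical stand-in for pm(AllArrays) (step 1).
set.seed(42)
pm_toy <- matrix(rexp(2000 * 4, rate = 1 / 200) + 40, ncol = 4,
                 dimnames = list(NULL, paste0("array", 1:4)))

if (requireNamespace("preprocessCore", quietly = TRUE)) {
  # Step 2: RMA background correction, column by column.
  bg <- preprocessCore::rma.background.correct(pm_toy)
  # Step 3: quantile normalization across arrays.
  norm <- preprocessCore::normalize.quantiles(bg)
  # Restore the row and column names from the input matrix.
  dimnames(norm) <- dimnames(pm_toy)
  # Step 4: log2 transform.
  log_expr <- log2(norm)
}
```

Summarization of probes into probeset values, the last stage of RMA on 
expression arrays, is left out here because the steps listed above stop 
at the log2 transform.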

The code (using Canine2 arrays for quicker experimentation) and 
sessionInfo are below.

Ann

**************************************

> library(affy)
> library(preprocessCore)
> AllArrays<-ReadAffy()
> PM<-pm(AllArrays)
> dim(PM)
[1] 473162     15
> PM[1:10,1:5]
        RB1.CEL RB10.CEL RB11.CEL RB12.CEL RB13.CEL
571209     134       52       56       67      309
571210     151       63       78       85      587
571211     388      116      134      129     1396
571212     215       56       68       67      530
571213      83       41       42       47      131
571214      83       42       41       50      148
571215     107       43       42       63      194
571216     216       48       69       83      620
571217     254       61       80       79      876
571218     131       42       61       47      246
> BG<-rma.background.correct(PM)
> dim(BG)
[1] 473162     15
> BG[1:10,1:5]
        RB1.CEL RB10.CEL RB11.CEL RB12.CEL RB13.CEL
571209     134       52       56       67      309
571210     151       63       78       85      587
571211     388      116      134      129     1396
571212     215       56       68       67      530
571213      83       41       42       47      131
571214      83       42       41       50      148
571215     107       43       42       63      194
571216     216       48       69       83      620
571217     254       61       80       79      876
571218     131       42       61       47      246

> Norm<-normalize.quantiles(BG)
> dim(Norm)
[1] 473162     15
> Norm[1:10,1:5]
           [,1]      [,2]      [,3]      [,4]     [,5]
  [1,] 128.4667 143.40000 114.06667  332.7333 230.0000
  [2,] 145.0000 202.66667 188.80000  560.2000 413.3333
  [3,] 383.6000 480.93333 368.73333 1110.5000 936.6333
  [4,] 208.6000 164.80000 155.20000  332.7333 375.8667
  [5,]  79.6000  84.40000  65.26667  103.0667 108.0667
  [6,]  79.6000  89.73333  62.00000  131.9333 120.0000
  [7,] 102.3333  95.26667  65.26667  283.1333 151.9333
  [8,] 209.4000 122.13333 158.80000  534.5333 435.6000
  [9,] 247.5333 191.73333 195.80000  482.5333 602.5667
[10,] 125.7333  89.73333 131.33333  103.0667 187.9333

> apply(Norm,2,median)
  [1] 104.4667 106.1333 103.7333 103.0667 103.9333 104.2000 104.4667
  [8] 104.4667 104.8000 104.0667 103.6667 103.6667 107.2000 106.0000
[15] 103.6667

> apply(PM,2,median)
  RB1.CEL RB10.CEL RB11.CEL RB12.CEL RB13.CEL RB14.CEL RB15.CEL
      109       45       53       47      125      355      191
  RB2.CEL  RB3.CEL  RB4.CEL  RB5.CEL  RB6.CEL  RB7.CEL  RB8.CEL
       93       86       93      113      105       42       53
  RB9.CEL
       53

> sessionInfo()
R version 2.7.1 (2008-06-23)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets
[7] methods   base

other attached packages:
[1] canine2cdf_2.2.0     affy_1.18.2          preprocessCore_1.2.0
[4] affyio_1.8.0         Biobase_2.0.1


