[BioC] preprocessing for tiling arrays (preprocessCore package)
Ann Hess
hess at stat.colostate.edu
Mon Jul 21 19:13:33 CEST 2008
I would like to perform RMA-style preprocessing for tiling arrays. Based
on previous discussion (thanks Ben and Jim!), it seems that the functions
rma.background.correct() and normalize.quantiles() from the
preprocessCore package are the easiest way to accomplish this.
First of all, I am having trouble getting the functions to work. The
rma.background.correct() function appears to return the input values
unchanged. The normalize.quantiles() function does change the data,
but the medians are not equal across arrays (which I would have expected
from quantile normalization).
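To make that expectation concrete, here is a minimal base-R sketch of
quantile normalization as I understand it. It is not the preprocessCore
implementation: the function name quantile_normalize_sketch and the
simulated matrix toy are made up for illustration, and the simulated
values are continuous, so there are no ties to deal with.

```r
# Toy quantile normalization in base R (illustration only; this is NOT
# the code used by preprocessCore). Assumes a numeric matrix without NAs.
quantile_normalize_sketch <- function(x) {
  # Sort each column, then average across columns rank by rank
  sorted <- apply(x, 2, sort)
  target <- rowMeans(sorted)
  # Replace each value by the target value at its rank within its column
  out <- apply(x, 2, function(col) target[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(x)
  out
}

# Hypothetical data: three "arrays" with very different intensity scales
set.seed(1)
toy <- cbind(A = rexp(1000, rate = 1 / 100),
             B = rexp(1000, rate = 1 / 50),
             C = rexp(1000, rate = 1 / 300))

norm_toy <- quantile_normalize_sketch(toy)

# Every column is now a permutation of the same target values,
# so the column medians agree
print(apply(toy, 2, median))
print(apply(norm_toy, 2, median))
```

In this sketch every column ends up with the same set of values, which
is why I expected identical medians from normalize.quantiles() as well.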
Secondly, I wanted to verify that, to match the RMA algorithm as closely
as possible, I should (1) restrict to PM values, (2) background correct,
(3) normalize, then (4) log2 transform. Is this correct?
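The order of steps I have in mind can be sketched as follows on
simulated data. The matrix pm_toy is hypothetical and stands in for
pm(AllArrays), so step (1) is assumed to be done already. The two
preprocessCore functions are called with their defaults, and whether
this ordering really matches RMA is exactly what I am asking.

```r
library(preprocessCore)

# Hypothetical stand-in for pm(AllArrays): 2000 probes on 4 arrays,
# integer-valued intensities with a small constant offset
set.seed(42)
pm_toy <- matrix(round(rexp(2000 * 4, rate = 1 / 150)) + 30,
                 ncol = 4,
                 dimnames = list(NULL, paste0("array", 1:4)))

bg   <- rma.background.correct(pm_toy)  # step (2): background correct
norm <- normalize.quantiles(bg)         # step (3): quantile normalize
expr <- log2(norm)                      # step (4): log2 transform

# Compare the column medians before and after preprocessing
print(apply(pm_toy, 2, median))
print(apply(expr, 2, median))
```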
The code (using Canine2 arrays for quicker experimentation) and
sessionInfo are below.
Ann
**************************************
> library(affy)
> library(preprocessCore)
> AllArrays<-ReadAffy()
> PM<-pm(AllArrays)
> dim(PM)
[1] 473162 15
> PM[1:10,1:5]
       RB1.CEL RB10.CEL RB11.CEL RB12.CEL RB13.CEL
571209     134       52       56       67      309
571210     151       63       78       85      587
571211     388      116      134      129     1396
571212     215       56       68       67      530
571213      83       41       42       47      131
571214      83       42       41       50      148
571215     107       43       42       63      194
571216     216       48       69       83      620
571217     254       61       80       79      876
571218     131       42       61       47      246
> BG<-rma.background.correct(PM)
> dim(BG)
[1] 473162 15
> BG[1:10,1:5]
       RB1.CEL RB10.CEL RB11.CEL RB12.CEL RB13.CEL
571209     134       52       56       67      309
571210     151       63       78       85      587
571211     388      116      134      129     1396
571212     215       56       68       67      530
571213      83       41       42       47      131
571214      83       42       41       50      148
571215     107       43       42       63      194
571216     216       48       69       83      620
571217     254       61       80       79      876
571218     131       42       61       47      246
> Norm<-normalize.quantiles(BG)
> dim(Norm)
[1] 473162 15
> Norm[1:10,1:5]
[,1] [,2] [,3] [,4] [,5]
[1,] 128.4667 143.40000 114.06667 332.7333 230.0000
[2,] 145.0000 202.66667 188.80000 560.2000 413.3333
[3,] 383.6000 480.93333 368.73333 1110.5000 936.6333
[4,] 208.6000 164.80000 155.20000 332.7333 375.8667
[5,] 79.6000 84.40000 65.26667 103.0667 108.0667
[6,] 79.6000 89.73333 62.00000 131.9333 120.0000
[7,] 102.3333 95.26667 65.26667 283.1333 151.9333
[8,] 209.4000 122.13333 158.80000 534.5333 435.6000
[9,] 247.5333 191.73333 195.80000 482.5333 602.5667
[10,] 125.7333 89.73333 131.33333 103.0667 187.9333
> apply(Norm,2,median)
[1] 104.4667 106.1333 103.7333 103.0667 103.9333 104.2000 104.4667
[8] 104.4667 104.8000 104.0667 103.6667 103.6667 107.2000 106.0000
[15] 103.6667
> apply(PM,2,median)
 RB1.CEL RB10.CEL RB11.CEL RB12.CEL RB13.CEL RB14.CEL RB15.CEL
     109       45       53       47      125      355      191
 RB2.CEL  RB3.CEL  RB4.CEL  RB5.CEL  RB6.CEL  RB7.CEL  RB8.CEL
      93       86       93      113      105       42       53
 RB9.CEL
      53
> sessionInfo()
R version 2.7.1 (2008-06-23)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] tools stats graphics grDevices utils datasets
[7] methods base
other attached packages:
[1] canine2cdf_2.2.0 affy_1.18.2 preprocessCore_1.2.0
[4] affyio_1.8.0 Biobase_2.0.1