[BioC] Artifact in RMA-normalized data
Mary Putt
mputt at cceb.upenn.edu
Thu Mar 4 19:15:17 MET 2004
Hi,
I normalized the data from 13 arrays (6 in group H and 7 in group P) using
RMA. After normalization, the arrays from my H group are systematically
lower than those from the P group at the low end of the expression scale,
and higher than those from the P group at the high end of the scale. The
differences are subtle, but they show up in the MVA plots as well as in
the summary statistics below. I also got warning messages from affy during
the normalization. It doesn't seem to me that RMA should introduce this
type of artifact, unless there's something about the warning messages that
I don't understand. Does anyone have insights on this?
Thanks,
Mary
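As background for the question, here is a minimal sketch of plain quantile normalization in base R. It is not affy's implementation and not the robust variant used in the program below. The data are simulated and the helper name `quantile_normalize` is hypothetical. It only illustrates that, at the level where the normalization is applied, every array ends up with identical quantiles, so any remaining group differences in the summarized values would have to come from a later step or from the robust weighting.

```r
# Plain quantile normalization on simulated data (not affy's implementation).
set.seed(1)

# Hypothetical probe-level intensities: 1000 probes x 4 arrays with different spreads
x <- sapply(c(1, 1.2, 0.8, 1.5),
            function(s) rlnorm(1000, meanlog = 6, sdlog = s))

quantile_normalize <- function(m) {
  # Reference distribution: mean of the sorted columns, one value per rank
  ref <- rowMeans(apply(m, 2, sort))
  # Replace each value by the reference value at its rank within its own column
  apply(m, 2, function(col) ref[rank(col, ties.method = "first")])
}

xn <- quantile_normalize(x)

# Every column now has the same quartiles
print(apply(xn, 2, quantile, probs = c(0.25, 0.5, 0.75)))
```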
########
#Program to normalize the data
############
library(affy)
load('all.Rdata')
allnorm <- expresso(alldata, bgcorrect.method='rma',
                    normalize.method='quantiles.robus', pmcorrect.method='pmonly',
                    summary.method='medianpolish')
exprs.allnorm<-exprs(allnorm)
save(exprs.allnorm, file='exprs.allnorm.Rdata')
#########
#Warning msg following normalization
############
#> source('expresso.all.r')
#background correction: rma
#normalization: quantiles.robus
#PM/MM correction : pmonly
#expression values: medianpolish
#background correcting...done.
#normalizing...Chip weights are 1 1 1 1 1 1 0 1 1 1 1 1 1 1
#Chip weights are 1 1 1 1 1 1 1 1 0 1 1 1 1 1
#done.
#22283 ids to be processed
#.........
#Warning messages:
#1: the condition has length > 1 and only the first element will be used in: if (remove.extreme == "variance") {
#2: the condition has length > 1 and only the first element will be used in: if (remove.extreme == "mean") {
#3: the condition has length > 1 and only the first element will be used in: if (remove.extreme == "both") {
#4: the condition has length > 1 and only the first element will be used in: if (remove.extreme == "variance") {
#5: the condition has length > 1 and only the first element will be used in: if (remove.extreme == "mean") {
#6: the condition has length > 1 and only the first element will be used in: if (remove.extreme == "both") {
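The wording of these warnings is what R produces when `if` receives a condition with more than one element. One common way to get there is an argument that still holds its whole vector of default choices when it is compared with `==`. The sketch below shows that mechanism with a hypothetical function `pick`. It is not affy's code, and whether the robust quantile normalization is written this way is an assumption based only on the warning text.

```r
# Hypothetical stand-in for an argument that defaults to a vector of choices
choices <- c("variance", "mean", "both", "none")

# Comparing the whole vector with == gives one logical per choice, not a single TRUE/FALSE
cond <- choices == "variance"
print(cond)

# if() looks at one value only. In 2004-era R it took the first element and warned;
# recent R versions stop with an error instead. Either way only cond[1] could decide.
print(cond[1])

# match.arg() reduces the vector default to a single string before any comparison
pick <- function(remove.extreme = c("variance", "mean", "both", "none")) {
  remove.extreme <- match.arg(remove.extreme)
  remove.extreme == "variance"
}
print(pick())        # default resolves to the first choice
print(pick("mean"))  # an explicit choice is compared as a single string
```

Under that assumption the warnings would mean that the first listed choice was the one actually applied, rather than that the normalization failed.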
#####################
#descriptive statistics of normalized data
#
#note that h and p are different groups
##################
>
> summary
            h1     h2     h3     h4     h5     h6     p1     p2     p3     p4
Min      2.997  3.008  3.051  3.010  2.967  3.005  3.123  3.057  3.119  3.102
1stQrtl  4.719  4.679  4.762  4.739  4.771  4.771  4.895  4.717  4.926  4.891
Median   5.924  5.901  5.950  5.970  5.970  5.961  6.015  5.942  6.015  6.018
Mean     6.165  6.143  6.150  6.171  6.182  6.178  6.167  6.163  6.172  6.162
3rdQrtl  7.291  7.300  7.266  7.358  7.316  7.288  7.201  7.281  7.216  7.224
Max     13.310 13.620 13.760 13.800 13.660 13.660 13.800 13.790 13.670 13.660
            p5     p6     p7
Min      3.121  3.017  3.041
1stQrtl  4.938  4.829  4.835
Median   6.031  5.993  6.015
Mean     6.172  6.166  6.168
> apply(summary[,1:6], 1, median)
    Min 1stQrtl  Median    Mean 3rdQrtl     Max
 3.0065  4.7505  5.9555  6.1680  7.2955 13.6600
> apply(summary[,7:13], 1, median)
    Min 1stQrtl  Median    Mean 3rdQrtl     Max
  3.102   4.891   6.015   6.167   7.220  13.660
> apply(summary[,1:6], 1, mean)
      Min   1stQrtl    Median      Mean   3rdQrtl       Max
 3.006333  4.740167  5.946000  6.164833  7.303167 13.635000
> apply(summary[,7:13], 1, mean)
      Min   1stQrtl    Median      Mean   3rdQrtl       Max
 3.082857  4.861571  6.004143  6.167143  7.227143 13.602857
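The group comparison above can be sketched in base R as follows. The matrix `e` is a simulated stand-in for `exprs.allnorm` (the values are made up, and only the column names follow the h1..h6, p1..p7 layout), so the printed numbers will not match the tables in this message. The last step computes M and A values between the two group mean profiles, in the style of an MVA plot, which is one way to look for an intensity-dependent difference between groups.

```r
set.seed(2)

# Simulated stand-in for exprs.allnorm: 500 probesets x 13 arrays (values are made up)
arrays <- c(paste0("h", 1:6), paste0("p", 1:7))
e <- matrix(rnorm(500 * 13, mean = 6, sd = 2), nrow = 500,
            dimnames = list(NULL, arrays))

# Per-array summary: one column per array, rows Min / 1st Qu. / Median / Mean / 3rd Qu. / Max
s <- apply(e, 2, summary)

# Column indices of the two groups
h <- grep("^h", colnames(s))
p <- grep("^p", colnames(s))

# Median of each summary statistic within each group
group_medians <- cbind(H = apply(s[, h], 1, median),
                       P = apply(s[, p], 1, median))
print(round(group_medians, 3))

# M (difference) and A (average) between the group mean profiles on the log2 scale
A <- (rowMeans(e[, h]) + rowMeans(e[, p])) / 2
M <- rowMeans(e[, h]) - rowMeans(e[, p])

# A positive slope of M on A (M below 0 at low A, above 0 at high A)
# would correspond to the pattern described in this message
print(coef(lm(M ~ A)))
```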
--
Mary E. Putt
Assistant Professor of Biostatistics
Department of Biostatistics and Epidemiology
Center for Biostatistics and Epidemiology
School of Medicine, University of Pennsylvania,
621 Blockley Hall
423 Guardian Drive
Philadelphia, PA
19104-6021
Ph. (215) 573-7020
Fax (215) 573-4865