[BioC] Peculiar behaviour of normalize.quantiles (in affy, preprocessCore) if there are NA data
Wolfgang Huber
huber at ebi.ac.uk
Tue Jul 10 19:35:12 CEST 2007
Hi all,
I noticed a peculiar result when applying quantile normalisation to a data
matrix that contains NA values. The resulting data have a rather
artifactual-looking distribution, and I wonder whether:
- this is desired,
- if not, how it can be fixed,
- in either case, whether this is of general interest to people
who interpret the distributions of their data, e.g. from microarrays.
Here is some example code to reproduce it:

library("geneplotter")
library("preprocessCore")

set.seed(0xbeef)

## two columns, each a mixture of 10000 N(0,1) and 10000 U(0,10) values
x = matrix(as.numeric(NA), nrow=20000, ncol=2)
for(i in 1:ncol(x))
  x[,i] = c(rnorm(10000), runif(10000)*10)

## set 1000 randomly chosen entries of the last column to NA
x[ sample(nrow(x), 1000), ncol(x)] = NA

qx = normalize.quantiles(x)

## histograms of each column before (x) and after (qx) normalisation
par(mfrow=c(2,2))
for(what in c("x", "qx"))
  for(i in 1:2)
    hist(get(what)[,i], breaks=seq(-5,10, length=75),
         main=sprintf("%s[,%d]", what, i),
         col="orange", xlab="")

The resulting plot is here:
http://www.ebi.ac.uk/~huber/quantilenormalisation/normalize.quantiles.png
I noticed that the implementation in preprocessCore/src/qnorm.c makes no
special provision for NA values; could this be what confuses the
algorithm?
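
For illustration, one NA-aware variant could be sketched in plain R as follows. This is only a rough sketch under my own assumptions, not what preprocessCore does: the function name qnorm_na is hypothetical, each column's quantiles are computed from its non-NA values only, the reference distribution is the mean of these per-column quantiles on a common probability grid, and NA entries are left as NA.

```r
## Hypothetical helper (not part of preprocessCore): quantile-normalise the
## columns of a numeric matrix using only the non-NA values in each column.
qnorm_na <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x), nrow(x) >= 2)
  n <- nrow(x)
  p <- (seq_len(n) - 0.5) / n          # common probability grid in (0, 1)

  ## quantile function of every column on the common grid, ignoring NAs
  q <- apply(x, 2, function(col)
    quantile(col, probs = p, na.rm = TRUE, names = FALSE))
  target <- rowMeans(q)                # reference distribution

  out <- x                             # NA positions are carried over as is
  for (j in seq_len(ncol(x))) {
    ok <- !is.na(x[, j])
    ## ranks among the observed values only, rescaled to (0, 1)
    r <- (rank(x[ok, j], ties.method = "average") - 0.5) / sum(ok)
    ## look up the reference quantile for each rescaled rank
    out[ok, j] <- approx(p, target, xout = r, rule = 2)$y
  }
  out
}
```

Calling qnorm_na(x) on the example matrix above and repeating the histograms would show whether the artifact disappears once the NAs are kept out of the ranking; I assume here that interpolating the reference distribution is acceptable for columns with fewer observed values.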
R version 2.6.0 Under development (unstable) (2007-07-10 r42165)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] tools stats graphics grDevices datasets utils methods
[8] base
other attached packages:
[1] preprocessCore_0.99.8 geneplotter_1.15.1 lattice_0.16-1
[4] annotate_1.15.2 AnnotationDbi_0.0.78 RSQLite_0.5-4
[7] DBI_0.2-3 Biobase_1.15.17 fortunes_1.3-3
loaded via a namespace (and not attached):
[1] grid_2.6.0 KernSmooth_2.22-20 RColorBrewer_0.2-3
Best wishes
Wolfgang
------------------------------------------------------------------
Wolfgang Huber EBI/EMBL Cambridge UK http://www.ebi.ac.uk/huber