[BioC] Question about quantile normalization

K J hamuteru_kimiteru at wind.ocn.ne.jp
Tue Aug 2 15:41:26 CEST 2011


Dear Dr. Laurent Gautier,

Thanks for your kind reply.

 > Quantile normalization is usually performed on untransformed data
("raw-scale").

I read several papers on RMA and RMA++, and confirmed that quantile
normalization is usually applied to raw-scale data, as you say.

 > Missing values are just ignored and left as such (missing values).

I'm afraid I don't quite understand your explanation of how N/A values
are treated.
Judging from the code of "normalizeBetweenArrays", it seems that each
N/A is first replaced with the column median, but after ranking,
averaging and re-ordering to the original positions, it is changed back
to N/A.  Is this correct?

I describe an example below:

step 1: Calculating median in each column

 > ngenes <- 5
 > narrays <- 2

 > x <- matrix(c(1:10),ngenes,narrays)
 > x[2,1] <- x[3,1] <- NA
 > x[4,2] <- x[5,2] <- NA

 > x
      [,1] [,2]
[1,]    1    6
[2,]   NA    7
[3,]   NA    8
[4,]    4   NA
[5,]    5   NA

 > (xm <- apply(x,2,median,na.rm=TRUE))

[1] 4 7

step 2: Replacing N/A in each column with median

 > x[2,1] <- x[3,1] <- xm[1]
 > x[4,2] <- x[5,2] <- xm[2]
 > x

      [,1] [,2]
[1,]    1    6
[2,]    4    7
[3,]    4    8
[4,]    4    7
[5,]    5    7

step 3: Sorting values in each column in descending order

      [,1] [,2]
[1,]    5    8
[2,]    4    7
[3,]    4    7
[4,]    4    7
[5,]    1    6

step 4: Averaging values in each rank

      [,1] [,2]    Average
[1,]    5    8    6.5
[2,]    4    7    5.5
[3,]    4    7    5.5
[4,]    4    7    5.5
[5,]    1    6    3.5

step 5: Replacing the values in each column with the average

        [,1]   [,2]      Average
[1,]    6.5    6.5      6.5
[2,]    5.5    5.5      5.5
[3,]    5.5    5.5      5.5
[4,]    5.5    5.5      5.5
[5,]    3.5    3.5      3.5


step 6: Re-sorting the values in each column at original positions

        [,1]   [,2]
[1,]    3.5    3.5
[2,]    5.5    5.5
[3,]    5.5    6.5
[4,]    5.5    5.5
[5,]    6.5    5.5

step 7: Replacing the values with N/A at original positions

        [,1]   [,2]
[1,]    3.5    3.5
[2,]    NA     5.5
[3,]    NA     6.5
[4,]    5.5    NA
[5,]    6.5    NA

This result matches the output of normalizeBetweenArrays():

 > x <- matrix(c(1:10),ngenes,narrays)
 > x[2,1] <- x[3,1] <- NA
 > x[4,2] <- x[5,2] <- NA
 > (y <- normalizeBetweenArrays(x))

        [,1]   [,2]
[1,]    3.5    3.5
[2,]    NA     5.5
[3,]    NA     6.5
[4,]    5.5    NA
[5,]    6.5    NA
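
The seven steps above can be sketched in base R as follows. This is only
an illustration of the procedure as described in this message. The helper
name quantile_norm_median_fill is hypothetical, the code is not taken from
limma's source, and it does not claim to reproduce how
normalizeBetweenArrays() handles N/A internally.

```r
# Hypothetical helper: follows the seven steps described above.
# It is an assumption-based sketch, not limma's implementation.
quantile_norm_median_fill <- function(x) {
  na_pos  <- is.na(x)                              # remember the N/A positions
  col_med <- apply(x, 2, median, na.rm = TRUE)     # step 1: column medians

  filled <- x
  for (j in seq_len(ncol(x))) {
    filled[na_pos[, j], j] <- col_med[j]           # step 2: fill N/A with the median
  }

  sorted    <- apply(filled, 2, sort, decreasing = TRUE)  # step 3: sort each column
  rank_mean <- rowMeans(sorted)                            # step 4: average per rank

  out <- filled
  for (j in seq_len(ncol(x))) {
    ord <- order(filled[, j], decreasing = TRUE)
    out[ord, j] <- rank_mean                       # steps 5-6: assign averages, restore order
  }

  out[na_pos] <- NA                                # step 7: put the N/A back
  out
}

ngenes  <- 5
narrays <- 2
x <- matrix(c(1:10), ngenes, narrays)
x[2,1] <- x[3,1] <- NA
x[4,2] <- x[5,2] <- NA
y <- quantile_norm_median_fill(x)
print(y)
```

For the example matrix, this sketch gives the same table as step 7. Tied
values within a column are assigned in whatever order order() returns them,
so a data set with ties that span different rank averages could need more
careful handling.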

J, K

--- On Mon, 2011/8/1, Laurent Gautier <laurent at cbs.dtu.dk> wrote:


On 2011-08-01 06:55, qwertyui_period at yahoo.co.jp wrote:
 > Dear all,
 >
 > My environment is limma Version 3.2.2, R version 2.10.1, and Windows XP.
 > I'm going to normalize the microarray data by "normalizeBetweenArrays"
 > which is the quantile normalization function in "limma" package.
 > I have read the "usersguide.pdf" in bioconductor website, however, I still
 > have two questions.
 >
 > Question 1: Which is proper to use for quantile normalization: raw-scale or
 > log2-scale values ?
 > The quantile normalization includes the step of calculating arithmetic mean,
 > so I suppose the raw-scale values should be used, though the microarray
 > data is generally log2-scale values.

Quantile normalization is usually performed on untransformed data
("raw-scale").
The log2 transformation comes afterwards (and before probe summarization
when using RMA or RMA-like approaches).
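
As a small illustration of why the scale matters for the averaging step,
the sketch below compares the two orders of operation on two made-up
intensities (the values 100 and 400 are assumptions chosen only for the
example). Averaging on the raw scale and then taking log2 is not the same
as averaging the log2 values, which corresponds to the geometric mean.

```r
# Made-up raw intensities for one rank on two arrays (illustrative only).
raw <- c(100, 400)

raw_then_log <- log2(mean(raw))   # arithmetic mean on the raw scale, then log2
log_then_avg <- mean(log2(raw))   # mean of log2 values = log2 of the geometric mean

print(c(raw_then_log = raw_then_log, log_then_avg = log_then_avg))
```

The first value is log2(250) and the second is log2(200), so the choice of
scale changes the normalized values.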

 > Question 2: How does "normalizeBetweenArrays" deal with "N/A" in data ?

Missing values are just ignored and left as such (missing values).


Hoping this helps,



L.

 > Example code 1,
 >
 >> ngenes<- 3
 >> narrays<- 2
 >> x<- matrix(c(3,1,5,6,4,2),ngenes,narrays)
 >       [,1] [,2]
 > [1,]    3    6
 > [2,]    1    4
 > [3,]    5    2
 >
 >> (y<- normalizeBetweenArrays(x))
 >       [,1] [,2]
 > [1,]  3.5  5.5
 > [2,]  1.5  3.5
 > [3,]  5.5  1.5
 >
 > I understand the process of "normalizeBetweenArrays" is divided into 4
 > steps as follows:
 >
 > step 1: Sorting values in each column in descending order
 >
 >       [,1] [,2]
 > [1,]    5    6
 > [2,]    3    4
 > [3,]    1    2
 >
 > step 2: Averaging values in each rank
 >
 >       [,1] [,2]    Average
 > [1,]    5    6    5.5
 > [2,]    3    4    3.5
 > [3,]    1    2    1.5
 >
 > step 3: Replacing the values in each column with the average
 >
 >        [,1]  [,2]  Average
 > [1,]  5.5   5.5   5.5
 > [2,]  3.5   3.5   3.5
 > [3,]  1.5   1.5   1.5
 >
 > step 4: Re-sorting the values in each column at original positions
 >
 >       [,1] [,2]
 > [1,]  3.5  5.5
 > [2,]  1.5  3.5
 > [3,]  5.5  1.5
 >
 > Then, how does "normalizeBetweenArrays" deal with "N/A" in data ?
 >
 > Example code 2,
 >
 >> (x<- matrix(c(NA,1,5,6,4,2),ngenes,narrays))
 >       [,1] [,2]
 > [1,]   NA    6
 > [2,]    1    4
 > [3,]    5    2
 >
 >> (y<- normalizeBetweenArrays(x))
 >
 >       [,1] [,2]
 > [1,]   NA  5.5
 > [2,]  1.5  3.5
 > [3,]  5.5  1.5
 >
 > Thanks in advance !