[BioC] normalize.AffyBatch.quantiles and NA values

Wed Jul 5 18:16:08 CEST 2006

Your understanding of what the code is doing is correct. For example:

> library(affy)
> data(affybatch.example)
> pm(affybatch.example)[1,]
20A 20B 10A
149 118 124
> pm(affybatch.example)[1,1] <- NA
> affybatch.example.norm <- normalize(affybatch.example)
> pm(affybatch.example)[1:3,]
       20A   20B   10A
[1,]    NA 118.0 124.0
[2,] 143.5 124.8 116.5
[3,] 132.0 111.0 105.0
> pm(affybatch.example.norm)[1:3,]
          20A      20B      10A
[1,]       NA 118.0000 124.0000
[2,] 127.2667 137.3667 120.1667
[3,] 115.8333 122.6000 107.3333
>

Note that normalize() calls normalize.AffyBatch.quantiles() by default in
this case.

Why does it do this? Because normalize.quantiles(), the underlying routine
that does the normalization, does not handle NA values, so any row
containing an NA is left out of the normalization and returned unchanged.
In some sense this was by design: the thinking was that if you start off
with raw CEL file data, there should not be any missing data.
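
As a rough illustration of the behaviour described above, here is a base-R
sketch. It is not the affy code itself, which calls a compiled routine; the
function name quantile_normalize_complete and the toy matrix are made up
for this example, and ties are broken by order of appearance, which may
differ from what normalize.quantiles() does.

```r
# Sketch only: quantile-normalize the rows that have no NA and leave
# every row containing an NA exactly as it was.
quantile_normalize_complete <- function(x) {
  # x: numeric matrix, rows = probes, columns = arrays
  ok <- stats::complete.cases(x)            # TRUE for rows without any NA
  if (!any(ok)) return(x)                   # nothing to normalize
  sub <- x[ok, , drop = FALSE]
  n <- nrow(sub)

  # Sort each column, then average across columns at each rank to get
  # the common target distribution.
  sorted <- matrix(apply(sub, 2, sort), nrow = n)
  target <- rowMeans(sorted)

  # Replace each value by the target value at its within-column rank.
  ranks <- matrix(apply(sub, 2, rank, ties.method = "first"), nrow = n)
  normed <- matrix(target[ranks], nrow = n)

  out <- x
  out[ok, ] <- normed                       # NA rows are not touched
  out
}

# Toy data (hypothetical values, not from affybatch.example)
x <- cbind(a = c(NA, 2, 4, 6),
           b = c(10, 1, 3, 5),
           c = c(20, 9, 3, 6))
res <- quantile_normalize_complete(x)
print(res)
```

In the result, the first row keeps its original values (including the NA),
while the remaining rows share the same set of values in every column.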

Ben

On Fri, 2006-06-30 at 09:33 -0500, odlc at uchicago.edu wrote:
> Hello,
> 
> I am trying to understand how normalize.AffyBatch.quantiles
> works. From what I understand of the code, it seems that
> rows (corresponding to a probe) which contain even a single
> NA are dropped; then, the quantile-normalization method 
> described in Bolstad et al. is applied to the remaining rows,
> and these normalized rows are put back into the original
> batch. 
> 
> In other words, values in a row that contains NA's
> remain unchanged.
> 
> Questions:
> 
> - Is this really how it works?
> 
> - If yes, is this the intended behavior?
> 
> Thank you,
> 
> Omar.
> 
-- 
Ben Bolstad <bmb at bmbolstad.com>
http://bmbolstad.com