[BioC] Terminology was RE: RMA normalization

Ben Bolstad bolstad at stat.berkeley.edu
Thu Sep 16 03:05:29 CEST 2004

On Wed, 2004-09-15 at 11:27, w.huber at dkfz-heidelberg.de wrote:
> As I said above, I do not believe this is specific to RMA or GCRMA, but
> rather a general problem for all normalization methods.

I hate to be pedantic, but people really should be careful about how
they use the term "normalization".  These comments are not aimed at
any particular individual(s), but I note a trend, particularly among
those who come to Affymetrix-style microarrays via the two-color world,
to use "normalization" to refer to the entire sequence of
pre-processing steps applied to the data, i.e.

"I normalized my data using RMA" or "I have GCRMA normalized data" or "I
used dChip normalization" .....

It is more precise to substitute some form of the term "pre-process"
for "normalization" in the above. Better still, perhaps, is to talk
about having "expression values", i.e.

"RMA expression values", "MAS5 expression values", "dChip expression
values", "GCRMA expression values".....

Why is any of this important? Because normalization usually refers to
something more specific. Many people like to think of the process of
going from raw probe-intensity data to expression values as a process
involving background adjustment, normalization and summarization steps.
In this context "normalization" refers to the process of reducing
unwanted technical variation. In the case of RMA and GCRMA, this
step happens to be quantile normalization.
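
To make the narrower sense of "normalization" concrete, here is a
minimal sketch of quantile normalization in plain Python. It is only an
illustration of the idea, not the code used by RMA or GCRMA: the
function name quantile_normalize and the toy data are assumptions made
for this example, and ties are simply broken by position rather than
handled specially.

```python
def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    arrays: list of equal-length lists of intensities, one list per chip.
    Returns new lists; the input is not modified.
    """
    if not arrays:
        return []
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Sort each array on its own, then average across arrays at each rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[i] for s in sorted_arrays) / len(arrays)
                  for i in range(n)]

    normalized = []
    for a in arrays:
        # Probe indices ordered from smallest to largest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            # Replace each value by the cross-array mean for its rank.
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy example: two "chips" with three probes each (made-up numbers).
chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(chips)
```

Note that this step only touches probe-level intensities. Background
adjustment comes before it and summarization into expression values
comes after it, which is why "normalization" is too narrow a name for
the whole procedure.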

Anyway, that is my opinion on the matter.


Ben Bolstad <bolstad at stat.berkeley.edu>
