[BioC] Question on normalization

Johannes Hüsing johannes.huesing@ruhrau.de
Tue, 18 Feb 2003 06:56:29 +0100


Gordon Barr <gab5@columbia.edu> [/91/Mon, Feb 17, 2003 at 12:43:10PM -0500]:
> To Bioconductor
> 
> We have used the quantile normalization method in Bioconductor and would
> like to know if there is consensus (or if not what are the opinions) about
> whether or not to normalize all experimental conditions and controls
> together or to analyze each group separately. We have two experimental
> groups and one control with multiple replicates for each from different
> animals.  The three conditions were run at the same time for each replicate
> to minimize variability between groups. It seems to us that if we normalize
> both experimental groups and controls together we will bias the results
> against ourselves, even if most gene expression levels are unchanged.
> 

I gather you want to do quantile normalization on each group
separately, i.e., normalize each group towards its own CDF (cumulative
distribution function, not cell description file) of probe-level
intensities.

As an illustrative example (not assuming this is the case in your
setting, but to see whether the technique works in a given situation),
suppose the expression values on one chip are heavily biased upwards
(because of more RNA or dye or whatever). If you normalize per
group, all genes within that chip's group will look slightly
upregulated relative to the other groups.
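
The effect can be tried out on made-up numbers. What follows is only a
minimal sketch in Python/NumPy, not the Bioconductor implementation:
quantile_normalize is a bare-bones helper that ignores ties, and the
group sizes, the noise level and the +2.0 offset on one chip are
assumptions chosen purely for illustration.

```python
import numpy as np

def quantile_normalize(x):
    """Give every column (chip) of x the same distribution:
    the mean of the sorted columns. Ties are not handled specially."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    target = np.sort(x, axis=0).mean(axis=1)
    return target[ranks]

rng = np.random.default_rng(0)
n_genes = 1000
# Assumption: identical true log-expression in all three groups.
truth = rng.normal(8.0, 2.0, size=n_genes)

def make_group(n_chips=3):
    # Each chip is the truth plus a little measurement noise.
    return truth[:, None] + rng.normal(0.0, 0.1, size=(n_genes, n_chips))

group_a, group_b, group_c = make_group(), make_group(), make_group()
group_a[:, 0] += 2.0  # one chip in group A is heavily biased upwards

# Separate normalization: each group gets its own target distribution.
sep_a, sep_b, sep_c = (quantile_normalize(g) for g in (group_a, group_b, group_c))
shift_separate = sep_a.mean() - sep_b.mean()

# Overall normalization: all nine chips share one target distribution.
joint = quantile_normalize(np.hstack([group_a, group_b, group_c]))
shift_overall = joint[:, 0:3].mean() - joint[:, 3:6].mean()

print("A minus B, separate:", shift_separate)  # close to 2.0 / 3
print("A minus B, overall: ", shift_overall)   # zero up to rounding
```

With three chips per group, the single biased chip pulls the target
distribution of group A up by about a third of its offset, so every
gene in A appears upregulated although nothing differs in the truth.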

My feeling is that separate normalization is even worse when the three
intensity CDFs differ in skew, because intensities in the tail of the
most skewed group will then be overemphasized in that group.
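
A second made-up illustration, again only a sketch under stated
assumptions: the skew is taken to be a purely technical artifact,
modelled by a hypothetical stretch_upper_tail distortion applied to
group C only, and the chips within a group differ just by small
constant offsets. quantile_normalize is the same bare-bones helper
(ties ignored), not a library function.

```python
import numpy as np

def quantile_normalize(x):
    """Give every column (chip) of x the same distribution:
    the mean of the sorted columns. Ties are not handled specially."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    target = np.sort(x, axis=0).mean(axis=1)
    return target[ranks]

rng = np.random.default_rng(1)
# Assumption: identical true log-expression in all three groups.
truth = rng.normal(8.0, 1.0, size=1000)
offsets = np.array([0.0, 0.1, -0.1])  # small per-chip shifts

def stretch_upper_tail(v):
    # Hypothetical artifact: values above 8 are stretched upwards,
    # which makes the intensity distribution right-skewed.
    return v + 0.5 * np.maximum(v - 8.0, 0.0) ** 2

group_a = truth[:, None] + offsets
group_b = truth[:, None] + offsets
group_c = stretch_upper_tail(truth)[:, None] + offsets

# Separate normalization keeps the skew inside group C.
sep_a = quantile_normalize(group_a)
sep_c = quantile_normalize(group_c)
diff_separate = sep_c.mean(axis=1) - sep_a.mean(axis=1)

# Overall normalization maps all nine chips to one distribution.
joint = quantile_normalize(np.hstack([group_a, group_b, group_c]))
diff_overall = joint[:, 6:9].mean(axis=1) - joint[:, 0:3].mean(axis=1)

top, low = np.argmax(truth), np.argmin(truth)
print("C minus A, separate, highest gene:", diff_separate[top])
print("C minus A, separate, lowest gene: ", diff_separate[low])
print("C minus A, overall, largest gap:  ", np.abs(diff_overall).max())
```

Under these assumptions the high-intensity genes look upregulated in
group C after separate normalization, while the low-intensity genes do
not, and the gap disappears after overall normalization.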

So I'd go for overall normalization if my computational facilities
allow it.

-- 
Johannes Hüsing   There is something fascinating about science. One gets
hannes@ruhrau.de  such wholesale returns of conjecture from such a 
                  trifling investment of fact.                Mark Twain