[BioC] Re: Use of RMA in increasingly-sized datasets

Sat Jun 4 09:47:01 CEST 2005

I am facing exactly the same problem as you right now.

I think there are two key steps in RMA that depend on the set of chips
in a run. One is the quantile normalization step, and the other is the
median polish summarization step. The target value for each quantile of
probe intensity is the mean of the same quantile across the entire chip
set in the run. And the expression value summarized from the 11-20 probe
intensities of a probe set is calculated by the median polish algorithm
applied to that probe set across the entire chip set.
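
To make the dependence on the chip set concrete, here is a rough sketch of
the two steps in plain Python. It is only an illustration, not the code
used by the affy package: the function names are made up, background
correction is left out, ties are handled naively, and I assume the
quantile target is the arithmetic mean of the sorted intensities.

```python
import statistics


def quantile_normalize(chips):
    """chips: list of equal-length lists of probe intensities, one per chip.

    Returns (normalized chips, target quantiles). The target for each rank
    is the mean over ALL chips in the run, so it changes with the chip set.
    """
    n_probes = len(chips[0])
    sorted_chips = [sorted(chip) for chip in chips]
    # Assumption: arithmetic mean of the r-th smallest value across chips.
    target = [statistics.fmean(col) for col in zip(*sorted_chips)]
    normalized = []
    for chip in chips:
        order = sorted(range(n_probes), key=lambda i: chip[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = target[rank]  # replace each value by its rank's target
        normalized.append(out)
    return normalized, target


def median_polish(matrix, max_iter=10, tol=1e-8):
    """matrix: rows = probes of one probe set, columns = chips (log2 scale).

    Returns one expression estimate per chip (overall effect + chip effect).
    Row and column medians are taken across the whole chip set.
    """
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep out row (probe) medians.
        row_med = [statistics.median(row) for row in resid]
        for i in range(n_rows):
            resid[i] = [v - row_med[i] for v in resid[i]]
            row_eff[i] += row_med[i]
        shift = statistics.median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep out column (chip) medians.
        col_med = [statistics.median(col) for col in zip(*resid)]
        for i in range(n_rows):
            resid[i] = [v - m for v, m in zip(resid[i], col_med)]
        col_eff = [c + m for c, m in zip(col_eff, col_med)]
        shift = statistics.median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return [overall + c for c in col_eff]
```

In both functions every chip in the run enters the calculation, which is
why the values for one chip can change when another chip is added.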

Therefore, the suggested use of "a standard training 50 chip set" is
effective in practice: the quantile target values change very little
when one chip is added to the 50 chip standard set, and the medians used
in the summarization step are robust enough that they barely move in the
resulting 51 chip data set.
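
The size of the change in the quantile targets can be checked with a small
sketch like the one below. Again this is only an illustration with made-up
function names, and it assumes the target is the arithmetic mean of the
sorted intensities. Under that assumption, adding one chip to N chips
moves each target by (new value - old target) / (N + 1).

```python
import statistics


def quantile_targets(chips):
    """Mean of the r-th smallest intensity across all chips, for each rank r."""
    sorted_chips = [sorted(chip) for chip in chips]
    return [statistics.fmean(col) for col in zip(*sorted_chips)]


def max_relative_shift(reference_chips, new_chip):
    """Largest relative change in any target quantile after adding new_chip."""
    before = quantile_targets(reference_chips)
    after = quantile_targets(reference_chips + [new_chip])
    return max(abs(a - b) / b for a, b in zip(after, before))


# Toy data: 50 identical reference chips and one chip that is twice as bright.
reference = [[100.0, 200.0, 400.0] for _ in range(50)]
new_chip = [200.0, 400.0, 800.0]
shift = max_relative_shift(reference, new_chip)
print(f"largest relative shift of a target quantile: {shift:.4f}")
```

In this toy case even a chip that is twice as bright as the reference moves
each target by only 1/51, about 2 percent. Real chips are of course not
identical, so this only shows the order of magnitude.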

But this method is very tedious when we have to process several chips one
by one, and it is impossible to create the standard set at the beginning
of a project.

I am also looking forward to hearing a good solution to this problem.

Bye.
Kawai

_______________________________________

Takatoshi Kawai, Ph.D.

Senior Scientist, Bioinformatics
Laboratory of Seeds Finding Technology
Eisai Co., Ltd.
5-1-3 Tokodai, Tsukuba-shi,
Ibaraki 300-2635, Japan

TEL: +81-29-847-7192
FAX: +81-29-847-7614
e-mail: t-kawai at hhc.eisai.co.jp