[BioC] iterative gcrma on 92 slides?
Francois Collin
fcollin at sbcglobal.net
Wed Jan 28 20:05:24 MET 2004
Having to analyze large sets of chips on a PC equipped with little memory, I had to deal with this issue all the time. Getting rma or gcrma expression summaries is typically viewed as a three-step process: background correction, normalization and model fitting. The last step can be further divided into two: model estimation, followed by model application (i.e. using the fitted model to obtain expression summaries from background-corrected, normalized cell intensities).
Background correction is really a single chip operation so you can break up your set into manageable pieces, apply the background correction, and save the background corrected intensity matrices to disk.
Quantile normalization pools chips together to obtain a target normalization vector. Once you have that vector, normalization also becomes a one-chip-at-a-time operation. There is no need to use all 92 chips to get that vector - just pick a random subset.
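To make the normalization step concrete, here is a minimal sketch in plain Python rather than the Bioconductor code. The function names `target_vector` and `normalize_chip` and the toy intensities are made up for illustration. It assumes every chip has the same number of cells, and ties are broken by position instead of being averaged.

```python
import random

def target_vector(chips):
    """Average the sorted intensities of the chosen chips, rank by rank."""
    sorted_chips = [sorted(c) for c in chips]
    n = len(sorted_chips)
    return [sum(vals) / n for vals in zip(*sorted_chips)]

def normalize_chip(chip, target):
    """Replace each intensity by the target value of the same rank.

    Ties are broken by position here; a real implementation may average them.
    """
    order = sorted(range(len(chip)), key=lambda i: chip[i])
    out = [0.0] * len(chip)
    for rank, idx in enumerate(order):
        out[idx] = target[rank]
    return out

# Toy data: four "chips" with four cells each (made-up numbers).
chips = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
    [1.0, 9.0, 2.0, 7.0],
]

random.seed(0)
subset = random.sample(chips, 2)   # random subset used only to build the target
target = target_vector(subset)

# Each chip is now normalized on its own, so only one needs to be in memory.
normalized = [normalize_chip(c, target) for c in chips]
```

After this step every chip has exactly the same distribution of values (the target), and only the target vector has to be kept around between chips.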
The model fitting step requires fitting a model to a set of chips. You do not have to use all 92 chips to fit this model. You can pick a random set of chips to estimate the model parameters. Once you have the fitted model (models really, one per probe set), computing expression summaries becomes a one chip at a time operation. If your chips naturally come in batches, it might be a good idea to use one batch to fit the model, and apply this model to all batches. Examining residuals will clearly tell you if you have differences between batches.
Feel free to e-mail me if I can help.
-francois
Dick Beyer <dbeyer at u.washington.edu> wrote:
I would like to do gcrma on 92 chips, type mgu74av2. I know I won't be able to if I just send all 92 to gcrma.
I was wondering if there were a way to iteratively get to the answer. Would it be useful to do gcrma in doable batches, say 20 or so, then do quantile normalization on all 92?
Or can I do justRMA on the 92, then somehow process the exprSet to achieve a gcrma result?
I am willing to try different approaches, but I am hoping for some advice on which ways might be best.
Thanks very much,
Dick
*******************************************************************************
Richard P. Beyer, Ph.D. University of Washington
Tel.:(206) 616 7378 Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696 4225 Roosevelt Way NE, # 100
Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html