[BioC] median polish vs mas

Wed May 19 04:10:35 CEST 2004

A couple of points to add to Jim's response:

1- Robust fits of linear models on the log scale offer something quantile
normalization does not: the removal of outliers that appear to be
outliers only when one looks across chips (see Li and Wong's PNAS
2001 paper). Median polish is a quick and dirty way of doing this. If you
want something fancier you can use the affyPLM package to perform
formal robust procedures.
I haven't found a procedure that clearly beats median polish as judged by
affycomp. The key is that one fits
multiple arrays and takes advantage of the probe effect to find outliers.
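
To make the across-chip point concrete, here is a minimal sketch in plain
Python. It is not the affy implementation; the function name median_polish
and the toy numbers are made up for illustration. It fits
overall + probe effect + array effect to the log2 intensities of one
probeset by alternately sweeping out row and column medians, with one
probe on one array deliberately perturbed.

```python
from statistics import median

def median_polish(x, max_iter=10, tol=1e-6):
    """Median polish of a probes-by-arrays matrix (rows = probes).

    Returns (overall, probe_effects, array_effects, residuals).
    Hypothetical helper written for this illustration only.
    """
    n_rows, n_cols = len(x), len(x[0])
    resid = [row[:] for row in x]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        change = 0.0
        # Sweep row (probe) medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            resid[i] = [v - m for v in resid[i]]
            row_eff[i] += m
            change += abs(m)
        # Move the median of the array effects into the overall term.
        m = median(col_eff)
        col_eff = [c - m for c in col_eff]
        overall += m
        # Sweep column (array) medians out of the residuals.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            for i in range(n_rows):
                resid[i][j] -= m
            col_eff[j] += m
            change += abs(m)
        # Move the median of the probe effects into the overall term.
        m = median(row_eff)
        row_eff = [r - m for r in row_eff]
        overall += m
        if change < tol:
            break
    return overall, row_eff, col_eff, resid

# Toy probeset: 5 probes x 4 arrays, log2 scale (made-up numbers).
probe = [0.0, 3.0, -3.0, 1.5, -1.5]   # probe effects
array = [8.0, 8.0, 9.0, 9.0]          # true expression per array
log_intensity = [[p + a for a in array] for p in probe]
log_intensity[0][2] += 2.5            # one bad probe on one chip

overall, probe_eff, array_eff, resid = median_polish(log_intensity)
expression = [overall + c for c in array_eff]
naive_means = [sum(row[j] for row in log_intensity) / len(log_intensity)
               for j in range(len(array))]

print(expression)    # [8.0, 8.0, 9.0, 9.0]
print(naive_means)   # [8.0, 8.0, 9.5, 9.0]
print(resid[0][2])   # 2.5
```

In this toy example the perturbed value, 11.5, is unremarkable among the
probes on its own chip (they range from 6 to 12). Only the fit across
chips, which knows that probe's usual level, leaves it with a residual of
2.5 while every other residual is 0. The array estimates are unaffected,
whereas the plain per-array mean for that chip is pulled up by 0.5.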

2- rma was tuned to the GeneLogic spike-in. To avoid the effect of
overtraining we assessed rma on the GeneLogic dilution and Affymetrix spike-in
experiments.

On Mon, 17 May 2004, James MacDonald wrote:

> Dear Naomi,
> 
> I think we are talking about two different things here. Your question
> appears to be whether or not rma is a reasonable method for computing
> expression values, and you appear not to distinguish between justRMA and
> rma. My statement is directed towards the purpose of justRMA.
> 
> To answer your question, I personally like rma, and I am not convinced
> that there is any over-normalization occurring by doing a quantile
> normalization followed by medianpolish. I have tried pretty much
> everything out there, and I have yet to find a method for computing
> expression values that I think does a better job in general use. This is
> based primarily on how well a given method works with the affy spike-in
> and GeneLogic dilution data sets (I have had arguments with other
> statisticians who think that rma only works as well as it does with
> these data sets because it has been specifically 'tuned' for them. If
> so, my hat is off to Rafael and Ben for their ability to come up with an
> algorithm that can magically pick the 16 spiked-in genes out of the
> other 18,000 or so other genes...).
> 
> For a variety of reasons, not the least of which is the fact that rma
> 'beats' most other methods, rma has sort of become the canonical method
> for computing expression values for Affy data. It has been implemented
> in other non-BioC packages such as GeneSpring, etc, and although I
> haven't seen anything concrete, I would bet dollars to donuts that the
> Affy PLIER algorithm is simply rma by another name. I think this is why
> your reviewer wants to know why you are doing quantile normalization
> followed by Tukey's biweight instead of what he/she would consider to be
> the 'usual' method.
> 
> Now to the point I was originally trying to make. One of the problems
> that people encounter with rma is the fact that you first have to create
> an AffyBatch with all of your chips, and then compute expression values
> which are stored in an exprSet. This can take a huge amount of RAM, and
> people with maybe 512 Mb of RAM (which is plenty for the vast majority
> of things you will ever do on a computer) were running out of memory
> with a relatively small number of chips. Rafael noted that a
> modification could be made to rma that would use much less memory, and
> with his help I wrote the original justRMA. This function was designed
> for one purpose only: to allow people with less RAM to be able to do
> rma.
> 
> The decision to use medianpolish wasn't arbitrary at all; justRMA is
> designed to give the exact same results as rma (which of course uses
> medianpolish to compute expression values), so by default I had to use
> medianpolish.
> 
> Best,
> 
> Jim
> 
> 
> 
> James W. MacDonald
> Affymetrix and cDNA Microarray Core
> University of Michigan Cancer Center
> 1500 E. Medical Center Drive
> 7410 CCGC
> Ann Arbor MI 48109
> 734-647-5623
> 
> >>> Naomi Altman <naomi at stat.psu.edu> 05/17/04 11:33AM >>>
> Dear Jim,
> 
> The reason I ask is that I have been using expresso with "mas".  But I
> recently had a paper returned with the comment that median polish was
> "known to be better".  If so, I should probably use it.  The reviewer
> appears to have based his/her remarks on the fact (mentioned in the
> review) that median polish is the "default".
> 
> If the decision to use median polish in justRMA was arbitrary, I would
> like to know this, since I am currently in the process of redoing all of
> the statistical analyses and tables in the paper (which is pretty
> time-consuming).  The main reason we are redoing everything, rather than
> defending our decision to use "mas" is that I certainly have no evidence
> that Tukey's biweight is "better" except for the heuristic about
> over-normalization, and I figured in the long run we will have fewer
> arguments with reviewers if we use the default. 
> 
> I should not have said that median polish is the "default" in justRMA,
> since it is the only method available, but I do think that its use in
> justRMA is an endorsement meaning that anyone doing anything besides
> Affy-type MAS5 or justRMA or justGCRMA (if this is available) is going
> to be asked to justify what they are doing with more stringency.
> 
> --Naomi
> 
> 
> 
> At 10:49 AM 5/17/2004, James MacDonald wrote:
> The default (and only) option for justRMA is medianpolish because
> justRMA is designed to *just* do *RMA*, which is a quantile
> normalization followed by medianpolish. The only reason justRMA exists
> is to allow people with less RAM to be able to do rma.
> 
> If you think a quantile normalization followed by Tukey's biweight will
> do better than rma, you can certainly do that using the expresso()
> function.
> 
> Best,
> 
> Jim
> 
> >>> Naomi Altman <naomi at stat.psu.edu> 05/17/04 10:15AM >>>
> I have been wondering why the default in justRMA is
> summary.method="medianpolish" instead of "mas", which is Tukey's
> biweight. Since we are already doing quantile normalization, doesn't the
> extra between-array step imposed by median polish give the possibility of
> masking differential expression?
> 
> Naomi S. Altman                                814-865-3791 (voice)
> Associate Professor
> Bioinformatics Consulting Center
> Dept. of Statistics                              814-863-7114 (fax)
> Penn State University                         814-865-1348 (Statistics)
> University Park, PA 16802-2111
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch 
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor 
>