[BioC] Re: Logit-t vs RMA
Rafael A. Irizarry
ririzarr at jhsph.edu
Fri Sep 26 02:13:21 MEST 2003
i haven't had time to read this paper carefully. but here are some minor
comments from what i saw:
1 - if i understood correctly, they compare their test to the t-test
(for RMA, dChip, and MAS), which in this data set implies they are doing
3 versus 3 comparisons (is this right?).
With an N=3 the t-test has very little
power. In fact, we find that with N=3, in these data (affy spikein
etc...), average log fold change outperforms the t-test dramatically. the
SAM statistic does even better:
http://biosun01.biostat.jhsph.edu/~ririzarr/badttest.png
notice in the posted figure that for high specificity (around 100 false
positives), avg log fc gives twice as many true positives as the t-test.
so their conclusion should not be that logit-t is better than RMA but
rather that logit-t is a better test than a t-test when N=3 regardless of
expression measure. not an impressive feat. RMA is not a test, it's an
expression measure. one can build tests with RMA. some will be better than
others. judging by their ROC, RMA using the SAM stat or simply the avg log
fc stat would outperform the logit-t.
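
A minimal simulation sketch of the point above, not the analysis behind the posted figure. It ranks genes by average log fold change and by a two-sample t statistic in a 3 versus 3 design and counts how many truly changed genes land at the top of each list. Every number here (gene counts, fold change, noise level, seed) is a hypothetical choice, and the simulation assumes the same noise variance for every gene, which favors fold change; real spike-in data will behave differently in detail.

```python
import random
import statistics

def simulate(n_genes=5000, n_spiked=50, log_fc=1.0, sd=0.25, n_rep=3, seed=0):
    """Return (TP by avg log fc, TP by |t|) among the top n_spiked genes.

    All parameters are hypothetical; noise is assumed identical for all genes.
    """
    rng = random.Random(seed)
    fc_scores, t_scores = [], []
    for g in range(n_genes):
        # the first n_spiked genes are the truly changed ("spiked-in") ones
        shift = log_fc if g < n_spiked else 0.0
        a = [rng.gauss(0.0, sd) for _ in range(n_rep)]
        b = [rng.gauss(shift, sd) for _ in range(n_rep)]
        diff = statistics.mean(b) - statistics.mean(a)  # average log fold change
        # pooled-variance two-sample t statistic (equal group sizes)
        pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
        t = diff / (pooled_var * 2 / n_rep) ** 0.5
        fc_scores.append((abs(diff), g))
        t_scores.append((abs(t), g))

    def true_positives(scores):
        # take the n_spiked highest-scoring genes and count the spiked ones
        top = sorted(scores, reverse=True)[:n_spiked]
        return sum(1 for _, g in top if g < n_spiked)

    return true_positives(fc_scores), true_positives(t_scores)

tp_fc, tp_t = simulate()
print("true positives in top 50: avg log fc =", tp_fc, ", t-test =", tp_t)
```

With only three replicates per group the variance estimate in the t statistic is very noisy, so some unchanged genes get large |t| by chance and push real changes down the list.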
2 - another problem i found is the use of the PPV for just one cut-off as
an assessment. ROC curves where both true positives (TP) and false
positives (FP) are shown are much more informative (notice TP and FP can
be calculated easily from the rates if one knows the number of spiked in
genes and total number of genes in the array). the PPV can be computed for
any cutoff or point in the ROC curve: TP/(FP+TP). In affycomp
(http://affycomp.biostat.jhsph.edu) we show
ROC curves where the FP go only up to 100 since having lists with more
than 100 FP is not practical. see Figure 5 here:
http://biosun01.biostat.jhsph.edu/~ririzarr/papers/affycomp.pdf
When one computes a t-test and uses a p-value of 0.01 as the threshold, one
is way outside this bound. So IMHO, Table 1 in their paper is misleading.
because the ROC curve flattens very quickly, if
one changed the p-value cut-off to 0.001 then the FPs for both RMA and
dChip would drop dramatically but the TPs wouldn't drop much. this is
why it is more informative to show ROC curves as opposed to just a
number based on one point in the ROC curve. if a one number summary is
needed, the area under the ROC curve or the ROC convex hull are much
better summaries than just one PPV.
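
A small sketch of the one-number summary mentioned above: the area under a ROC curve, computed by the trapezoid rule and optionally truncated at a maximum number of false positives. The function name `roc_area` and both example curves are hypothetical and are not taken from the paper or from affycomp. The two curves have the same TP and FP at one cutoff, so the same PPV there, but different areas.

```python
def roc_area(fp, tp, max_fp=None):
    """Trapezoidal area under a ROC curve.

    fp: increasing false-positive counts; tp: matching true-positive counts.
    If max_fp is given, the curve is cut at that FP count, with linear
    interpolation inside the segment that crosses it.
    """
    area = 0.0
    for i in range(1, len(fp)):
        x0, x1, y0, y1 = fp[i - 1], fp[i], tp[i - 1], tp[i]
        if max_fp is not None and x0 >= max_fp:
            break
        if max_fp is not None and x1 > max_fp:
            y1 = y0 + (y1 - y0) * (max_fp - x0) / (x1 - x0)
            x1 = max_fp
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# hypothetical curves: both reach TP=12 at FP=100, so PPV = 12/112 for both
fp_counts = [0, 10, 100, 1000]
tp_curve_a = [0, 8, 12, 14]   # finds most true positives early
tp_curve_b = [0, 2, 12, 14]   # finds them late

area_a = roc_area(fp_counts, tp_curve_a, max_fp=100)
area_b = roc_area(fp_counts, tp_curve_b, max_fp=100)
print(area_a, area_b)
```

Curve A has the larger area up to 100 FP even though a single PPV taken at FP=100 cannot tell the two apart.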
the ROC curves shown in this paper go up to rates of 0.4 (5000 FP). for
such tight comparisons, this should really go up to around 0.01 (100 FP)
so one can see the area of interest. in our NAR paper we show rates up to
1.0 but this is because the comparisons were not tight at all.
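
The conversion between rates and counts used in point 2, and the PPV formula TP/(FP+TP), can be sketched as follows. The array size (12600 genes) and the number of spiked-in genes (14) are round, illustrative assumptions chosen only so that rates of 0.4 and 0.01 land near the 5000 and 100 false positives quoted above; the function names are hypothetical.

```python
def counts_from_rates(tp_rate, fp_rate, n_spiked, n_total):
    """Turn ROC rates into counts, given the number of spiked-in and total genes."""
    n_null = n_total - n_spiked          # genes that truly did not change
    return tp_rate * n_spiked, fp_rate * n_null

def ppv(tp, fp):
    """Positive predictive value at one cutoff: TP / (TP + FP)."""
    return tp / (tp + fp)

N_TOTAL = 12600    # assumed array size, illustrative only
N_SPIKED = 14      # assumed number of spiked-in genes, illustrative only

# false positive counts at the two ROC rates discussed above
_, fp_at_040 = counts_from_rates(1.0, 0.40, N_SPIKED, N_TOTAL)
_, fp_at_001 = counts_from_rates(1.0, 0.01, N_SPIKED, N_TOTAL)
print(round(fp_at_040), round(fp_at_001))

# PPV at a cutoff that finds half of the spiked-in genes at an FP rate of 0.01
tp_half, fp_half = counts_from_rates(0.5, 0.01, N_SPIKED, N_TOTAL)
print(ppv(tp_half, fp_half))
```

Because the unchanged genes vastly outnumber the spiked-in ones, even a small FP rate gives a list dominated by false positives.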
3 - a minor mistake is that they incorrectly state that affy's spikein was
done on the hgu95av2 chip. it was done on the hgu95a chip.
4 - finally
On Thu, 25 Sep 2003, Eric wrote:
<SNIP>
> Lemon et al. developed an interesting and possibly improved gauge of
> confidence called the positive predictive value (PPV) that may be useful
the positive predictive value (PPV) is a term that has been around for
decades. in medical language:
"the positive predictive value of a test is the probability that the
patient has the disease when restricted to those patients who test
positive." a simple estimate is TP/(TP+FP).
hope this helps,
rafael
> for future scientists looking to test their low level algorithms on known
> data sets, but the heart of the paper has to do with their idea on
> transforming the intensity values.
>
> The authors set out, using a variation on Langmuir's adsorption isotherm
> (that is, the classic semi-log sigmoidal dose-response relationship) to
> transform the intensity values of individual probes on the array. To me,
> this makes more biological sense than some other procedures because it is
> based on the ligand-receptor relationship between the probes and the mRNA
> species to which they are designed to hybridize.
>
> However, when the authors combined their transformed feature level
> information into a single measure per probe set, they found that their
> procedure (Logit-Exp and Logit-ExpR) performed no better than RMA or
> dChip.
>
> Interestingly, if they DID NOT collapse their probe level data into a
> single probe_set value, and instead tested across all probes (logit-t),
> their transformed data did a much better job of winnowing the wheat from
> the chaff. They concluded that "...the modeling paradigm may cause the
> loss of information from the probe-level data"..
>
> This seems critical to me, there is a huge discrepancy between the
> significant gene lists generated with different probe level algorithms,
> and I don't believe we'll be able to understand why that dichotomy exists
> until we look at the underlying probe level information.
>
> I have been pleading with our stats department for over a year (I am just
> a neuroscientist and I write code like a hippopotamus roller skates) to
> employ a 2-way ANOVA on repeated measures at the probe level to test for
> significance, and in fact went so far as to put the notion (with some
> sample data) into a book chapter I authored earlier this year (Chapter 6:
> in A Beginner's Guide to Microarrays).
>
> The authors state that "the combination of logit transformation and
> probe-level statistical testing provides a means for greatly improved
> PPV...". I would agree, but add the caveat that the comparison, at the
> probe level, on untransformed values has yet to be done, thus the probe
> level idea may be more important than the transformation notion to
> improved PPV. Other methods have looked at the probe level information
> (e.g., Liu et al. 2002- Affymetrix multiple pairwise comparison- but
> their use of the feature level data as biological n may be inappropriate;
> and Zhang et al., 2002- but their intention was only for a two chip
> comparison).
>
> I believe that it is unfortunate that the authors resort to fold change
> as a final discriminator after all of that hard work, rather than a
> formal statistical test. I still feel that 2-way ANOVA on repeated
> measures is the right test for this, but would love to hear from others.
>
> -E
>
> P.S. My apologies to Lemon et al if I have misrepresented/ misunderstood
> your work. I will gladly retract/ correct this (or any part of it) at
> your request.
>
> At 12:00 PM 9/25/2003 +0200, you wrote:
>
> Message: 1
> Date: Wed, 24 Sep 2003 19:46:04 +0200
> From: "Dario Greco" <greco at biogem.it>
> Subject: [BioC] ...Logit-t vs RMA...
> To: "Bioconductor" <bioconductor at stat.math.ethz.ch>
> Message-ID: <002601c382c3$bd4a42a0$ce3ca48c at neo>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi to everybody,
> I've just read some days ago the new paper on "logit-t" method
> to analyze
> affy chips.
>
> --------------------------------------------------------------
> "A high performance test of differential gene expression for
> oligonucleotide arrays"
>
> William J Lemon, Sandya Liyanarachchi and Ming You
>
> Genome Biology 2003, 4:R67
> --------------------------------------------------------------
>
> What do you think about this?
>
> Regards
> Dario
>
> --------------------------------------------
> Dario Greco
> Institute of Genetics and Biophysics
> "Adriano Buzzati Traverso" - CNR
> 111, Via P.Castellino
> 80131 Naples, Italy
> phone +39 081 6132 367
> fax +39 081 6132 350
> email: greco at igb.cnr.it; greco at biogem.it
>
> Eric Blalock, PhD
> Dept Pharmacology, UKMC
> 859 323-8033
>