[BioC] Re: Logit-t vs RMA

Eric emblal at uky.edu
Fri Sep 26 11:37:16 MEST 2003


I knew when I saw your reply in my mailbox that I was in trouble :/

Thanks for the clarification. I hadn't looked carefully at the PPV 
references, and I believe I may have done the authors a disservice by 
implying that they had developed PPV rather than adopting it (this was my 
mistaken impression and not their claim). I'll leave it to the authors to 
address the shortfalls you point out, and I agree that the paper may 
inappropriately/unfairly compare logit-t to RMA and dChip.

My enthusiasm regarding the paper is that it is the first (to my 
knowledge) to interrogate probe set differences at the probe level 
across groups (rather than pairwise), especially since the authors reported 
what I suspected: that the probe-level data do a better job than the 
expression values generated by a probe-level algorithm (I could still be 
totally wrong, but it IS what I have been suspecting). This may be more 
important for discerning meaningful differences than the transforms 
themselves. If it is possible to interrogate the data at this level, why 
bother with probe-level algorithms that may lose information in the 
process of cooking 11-32 intensity values down to a single number? Personally, 
I'd be willing to tolerate the extra statistical complexity to get results 
that more accurately reflect the biological processes under investigation.
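To make that concern concrete, here is a toy simulation (all numbers invented; this is deliberately naive and is not the Logit-t procedure) contrasting a t-test on per-chip summary values with a test that keeps the probe-level resolution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_probes, n_chips = 16, 3                        # 16 probes per set, 3 chips per group
probe_affinity = rng.normal(0.0, 1.0, n_probes)  # probe-specific binding effects

# Simulated log-intensities: group B is shifted up by 0.5 at every probe
a = probe_affinity[:, None] + rng.normal(0.0, 0.3, (n_probes, n_chips))
b = probe_affinity[:, None] + 0.5 + rng.normal(0.0, 0.3, (n_probes, n_chips))

# (1) Summarize first (one expression value per chip), then t-test with n=3 per group
t_sum, p_sum = stats.ttest_ind(a.mean(axis=0), b.mean(axis=0))

# (2) Keep the probe level: one mean difference per probe, 16 values in the test
#     (naive -- this treats probes as independent and ignores within-chip correlation)
d = b.mean(axis=1) - a.mean(axis=1)
t_probe, p_probe = stats.ttest_1samp(d, 0.0)

print(f"summarized p = {p_sum:.4f}, probe-level p = {p_probe:.2e}")
```

Note that treating the 16 probes as independent observations inflates the effective sample size; a real probe-level analysis would have to model the within-chip correlation, which is exactly where something like a two-way repeated-measures ANOVA comes in.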

I realize that the data set you had to work with was relatively small (n = 
3/group), but would you still advocate average log fold change as a 
discriminator if you had, say, 10 chips in each group? While it would 
probably still work well for spike-in data, how would it do on real 
biological samples? We are working very hard to dispel the notion that one 
can obtain accurate microarray results with an n of 3 per group, particularly 
in animal studies. This has never been acceptable in univariate work (at 
least in our field of research), and there is nothing magical about 
microarray technology that makes GeneChips more capable of assessing 
biological variance when only a few biological replicates are present. In 
our grant writing, comments to other researchers in the neurosciences, and 
advice from our microarray core, we strongly advocate sufficient 
replication and statistical determination of significant differences.
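The replication point is easy to demonstrate with a quick simulation. The effect size, noise level, and alpha below are arbitrary illustrative choices, not estimates from any real experiment:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def power(n, delta=1.0, sigma=1.0, alpha=0.01, n_sim=2000):
    """Fraction of simulated two-sample t-tests that detect a true shift of delta."""
    hits = 0
    for _ in range(n_sim):
        a = rng.normal(0.0, sigma, n)      # control group, n chips
        b = rng.normal(delta, sigma, n)    # treated group, n chips
        if stats.ttest_ind(a, b).pvalue < alpha:
            hits += 1
    return hits / n_sim

pow3, pow10 = power(3), power(10)
print(f"power with n=3: {pow3:.2f}, with n=10: {pow10:.2f}")
```

With these (made-up) settings, going from 3 to 10 chips per group substantially increases the chance of detecting a genuine one-sigma difference; the univariate intuition carries over to arrays unchanged.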

Cheers,
-E

At 01:13 AM 9/26/2003 -0400, you wrote:
>I haven't had time to read this paper carefully, but here are some minor
>comments from what I saw:
>
>1) If I understood correctly, they compare their test to the t-test
>(for RMA, dChip, and MAS), which in this data set implies they are doing
>3 versus 3 comparisons (is this right?)
>With an N=3 the t-test has very little
>power. In fact, we find that with N=3, in these data (Affy spike-in
>etc.), average log fold change outperforms the t-test dramatically. The
>SAM statistic does even better:
>
>http://biosun01.biostat.jhsph.edu/~ririzarr/badttest.png
>
>Notice in the posted figure that at high specificity (around 100 false
>positives), avg log fc gives twice as many true positives.
>
>So their conclusion should not be that logit-t is better than RMA, but
>rather that logit-t is a better test than the t-test when N=3, regardless of
>expression measure. Not an impressive feat. RMA is not a test; it's an
>expression measure. One can build tests with RMA; some will be better than
>others. Judging by their ROC, RMA using the SAM stat or simply the avg log
>fc stat would outperform the logit-t.
>
>2 - Another problem I found is the use of the PPV at just one cut-off as
>an assessment. ROC curves, in which both true positives (TP) and false
>positives (FP) are shown, are much more informative (notice that TP and FP
>can be calculated easily from the rates if one knows the number of spiked-in
>genes and the total number of genes on the array). The PPV can be computed at
>any cutoff or point on the ROC curve: TP/(FP+TP). In affycomp
>(http://affycomp.biostat.jhsph.edu) we show
>ROC curves where the FP go only up to 100, since having lists with more
>than 100 FP is not practical. See Figure 5 here:
>
>http://biosun01.biostat.jhsph.edu/~ririzarr/papers/affycomp.pdf
>
>When one computes a t-test and uses a p-value of 0.01 as the threshold, one
>is way outside this bound. So IMHO, Table 1 in their paper is misleading.
>Because the ROC curve flattens very quickly, if
>one changed the p-value cut-off to 0.001, the FPs for both RMA and
>dChip would drop dramatically but the TPs wouldn't drop much. This is
>why it is more informative to show ROC curves as opposed to just a
>number based on one point on the ROC curve. If a one-number summary is
>needed, the area under the ROC curve or the ROC convex hull is a much
>better summary than a single PPV.
>
>The ROC curves shown in this paper go up to rates of 0.4 (5000 FP). For
>such tight comparisons, this should really go only up to around 0.01 (100 FP)
>so one can see the area of interest. In our NAR paper we show rates up to
>1.0, but that is because the comparisons were not tight at all.

3 - A minor mistake: they incorrectly state that Affy's spike-in was
done on the hgu95av2 chip; it was done on the hgu95a chip.

4 - Finally,

On Thu, 25 Sep 2003, Eric wrote:

<SNIP>
 > Lemon et al. developed an interesting and possibly improved gauge of
 > confidence called the positive predictive value (PPV) that may be useful

The positive predictive value (PPV) is a term that has been around for
decades. In medical language:
"the positive predictive value of a test is the probability that the
patient has the disease when restricted to those patients who test
positive." A simple estimate is TP/(TP+FP).
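A toy calculation makes the cutoff-dependence of the PPV easy to see (the array size, spike-in count, and scores below are all simulated, not taken from any real data set):

```python
import numpy as np

rng = np.random.default_rng(2)
n_genes, n_spiked = 12000, 16              # assumed array / spike-in sizes
spiked = np.zeros(n_genes, dtype=bool)
spiked[:n_spiked] = True

# Higher score = more evidence of differential expression;
# spiked-in genes score higher on average
score = rng.normal(0.0, 1.0, n_genes)
score[spiked] += 3.0

order = np.argsort(-score)                 # rank genes by score, best first
is_tp = spiked[order]
tp = np.cumsum(is_tp)                      # true positives at each list size
fp = np.cumsum(~is_tp)                     # false positives at each list size

for k in (10, 50, 100):                    # three different cutoffs
    ppv = tp[k - 1] / k                    # PPV = TP / (TP + FP)
    print(f"top {k:4d}: TP={tp[k-1]:2d}  FP={fp[k-1]:4d}  PPV={ppv:.3f}")
```

Ranking by any statistic and sweeping the cutoff traces out the whole ROC curve, of which each reported PPV is a single point.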

hope this helps,
rafael




> > for future scientists looking to test their low level algorithms on known
> > data sets, but the heart of the paper has to do with their idea on
> > transforming the intensity values.
> >
> > The authors set out, using a variation on Langmuir's adsorption isotherm
> > (that is, the classic semi-log sigmoidal dose-response relationship) to
> > transform the intensity values of individual probes on the array. To me,
> > this makes more biological sense than some other procedures because it is
> > based on the ligand-receptor relationship between the probes and the mRNA
> > species to which they are designed to hybridize.
> >
> > However, when the authors combined their transformed feature level
> > information into a single measure per probe set, they found that their
> > procedure (Logit-Exp and Logit-ExpR) performed no better than RMA or
> > dChip.
> >
> > Interestingly, if they DID NOT collapse their probe-level data into a
> > single probe-set value, and instead tested across all probes (logit-t),
> > their transformed data did a much better job of winnowing the wheat from
> > the chaff. They concluded that "...the modeling paradigm may cause the
> > loss of information from the probe-level data."
> >
> > This seems critical to me: there is a huge discrepancy between the
> > significant gene lists generated by different probe-level algorithms,
> > and I don't believe we'll be able to understand why that discrepancy exists
> > until we look at the underlying probe-level information.
> >
> > I have been pleading with our stats department for over a year (I am just
> > a neuroscientist and I write code like a hippopotamus roller skates)  to
> > employ a 2-way ANOVA on repeated measures at the probe level to test for
> > significance, and in fact went so far as to put the notion (with some
> > sample data) into a book chapter I authored earlier this year (Chapter 6:
> > in A Beginner's Guide to Microarrays).
> >
> > The authors state that "the combination of logit transformation and
> > probe-level statistical testing provides a means for greatly improved
> > PPV...". I would agree, but add the caveat that the comparison, at the
> > probe level, on untransformed values has yet to be done, thus the probe
> > level idea may be more important than the transformation notion to
> > improved PPV. Other methods have looked at the probe level information
> > (e.g., Liu et al. 2002- Affymetrix multiple pairwise comparison- but
> > their use of the feature-level data as biological n may be inappropriate;
> > and Zhang et al., 2002- but their intention was only for a two chip
> > comparison).
> >
> > I believe that it is unfortunate that the authors resort to fold change
> > as a final discriminator after all of that hard work, rather than a
> > formal statistical test. I still feel that 2-way ANOVA on repeated
> > measures is the right test for this, but would love to hear from others.
> >
> > -E
> >
> > P.S. My apologies to Lemon et al if I have misrepresented/ misunderstood
> > your work. I will gladly retract/ correct this (or any part of it) at
> > your request.
> >
> > At 12:00 PM 9/25/2003 +0200, you wrote:
> >
> >       Message: 1
> >       Date: Wed, 24 Sep 2003 19:46:04 +0200
> >       From: "Dario Greco" <greco at biogem.it>
> >       Subject: [BioC] ...Logit-t vs RMA...
> >       To: "Bioconductor" <bioconductor at stat.math.ethz.ch>
> >       Message-ID: <002601c382c3$bd4a42a0$ce3ca48c at neo>
> >       Content-Type: text/plain;       charset="us-ascii"
> >
> >       Hi to everybody,
> >       A few days ago I read the new paper on the
> >       "logit-t" method for analyzing
> >       affy chips.
> >
> >       --------------------------------------------------------------
> >       "A high performance test of differential gene expression for
> >       oligonucleotide arrays"
> >
> >       William J Lemon, Sandya Liyanarachchi and Ming You
> >
> >       Genome Biology 2003, 4:R67
> >       --------------------------------------------------------------
> >
> >       What do you think about this?
> >
> >       Regards
> >       Dario
> >
> >       --------------------------------------------
> >       Dario Greco
> >       Institute of Genetics and Biophysics
> >       "Adriano Buzzati Traverso" - CNR
> >       111, Via P.Castellino
> >       80131 Naples, Italy
> >       phone +39 081 6132 367
> >       fax  +39 081 6132 350
> >       email: greco at igb.cnr.it; greco at biogem.it
> >
> > Eric Blalock, PhD
> > Dept Pharmacology, UKMC
> > 859 323-8033
> >
> > STATEMENT OF CONFIDENTIALITY
> >
> > The contents of this e-mail message and any attachments are confidential
> > and are intended solely for addressee. The information may also be
> > legally privileged. This transmission is sent in trust, for the sole
> > purpose of delivery to the intended recipient. If you have received this
> > transmission in error, any use, reproduction or dissemination of this
> > transmission is strictly prohibited. If you are not the intended
> > recipient, please immediately notify the sender by reply e-mail or at
> > (859) 323-8033 and delete this message and its attachments, if any.
> >

Eric Blalock, PhD
Dept Pharmacology, UKMC
859 323-8033



