[BioC] straight t vs. bonferroni vs. all the new stuff.

Fri Oct 20 03:21:46 CEST 2006

"Oh yes,  I forgot to mention that there is no universally good value to use 
for your cut-off.  If most of the genes are non-differentially expressing, 
most of your errors will be false detects.  If most of the genes are 
differentially expressing, most of your errors will be false non-detects."

i totally understand this. do you ever tend to see standard values (or 
magnitudes) associated with things that are known/expected to differ, 
however, like drug-induced upregulation of certain liver p450s?

Thank You,

Matthew Lyon        UC  Riverside                    lab (951) 827-4736
Ph.D. Student        B O T A N Y                new c.p. (951) 941-5554
Citrus Genomics                   apt (951) 328-9930
http: // int - citrusgenomics . org /           messengers: ptrifoliata
mattlyon at mattlyon.com ptrifoliata at hotmail.com  mlyon003 at student.ucr.edu

>From: Naomi Altman <naomi at stat.psu.edu>
>To: Sean Davis <sdavis2 at mail.nih.gov>,Matthew Lyon 
><ptrifoliata at hotmail.com>
>CC: bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] straight t vs. bonferroni vs. all the new stuff.
>Date: Thu, 19 Oct 2006 21:18:38 -0400
>
>I am trying to understand the issues better, too, but let me give this a 
>try:
>
>Firstly, I think that you must mean that n=n.tests=450,000.
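
To put n.tests = 450,000 in perspective, here is a rough sketch of what a Bonferroni cut-off would demand of a t-statistic with 2 or 4 df. The 0.05 family-wise level is an assumption (the thread never names one), and the function names are made up for illustration. The p-value formulas are the standard closed forms for Student's t with 2 and 4 df.

```python
import math

def two_sided_p(t, df):
    """Exact two-sided p-value P(|T| >= t) for Student's t with df = 2 or 4."""
    if df == 2:
        return 1.0 - t / math.sqrt(2.0 + t * t)
    if df == 4:
        s = t / math.sqrt(4.0 + t * t)
        return 1.0 - (3.0 * s - s ** 3) / 2.0
    raise ValueError("this sketch only implements df = 2 and df = 4")

def critical_t(alpha, df, hi=1.0e7):
    """Bisection for the |t| whose two-sided p-value equals alpha."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if two_sided_p(mid, df) > alpha:
            lo = mid   # p-value still too large, so a bigger |t| is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

n_tests = 450_000
alpha_bonf = 0.05 / n_tests          # assumed family-wise level of 0.05
t_df2 = critical_t(alpha_bonf, 2)
t_df4 = critical_t(alpha_bonf, 4)
print(alpha_bonf, t_df2, t_df4)
```

Under these assumptions the required |t| comes out at roughly 3000 for 2 df and roughly 86 for 4 df, which gives a feel for why a Bonferroni-style cut-off is so hard to meet with this few degrees of freedom.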
>
>Bonferroni and Holm control the probability of making one or more errors 
>when none of the genes differentially express.
>
>If that is what you want to guard against, then Holm is the method to use 
>for the reason that Sean states.
>
>Most of us would be happy if a large percentage of the genes that we 
>declare to be differentially expressed, really are.  FDR is a set of 
>methods that allow you to compute the expected percentage of mistakes you 
>make if you reject at a certain level.  The way I use it is to look at 
>the q-values and the p-values.  If the percentage of differentially 
>expressing genes is small, I set a q-value (i.e. an acceptable upper limit 
>for FDR) and declare genes with p-value at the corresponding level or less 
>to be significant.  If the percentage of differentially expressing genes is 
>large, I set a p-value for significance, and report the corresponding FDR.
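
The two ways of working described above can be sketched in a few lines. This uses Benjamini-Hochberg adjusted p-values as a stand-in for q-values; that is an assumption on my part, since q-values in the Storey sense also fold in an estimate of the fraction of null genes, and the BH adjustment corresponds to taking that fraction to be 1. The function names and the ten p-values are hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Walk from the largest p-value down, keeping a running minimum.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k                      # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def p_cutoff_for_fdr(pvals, q):
    """Fix an FDR limit q; return the largest p-value still declared significant."""
    adjusted = bh_adjust(pvals)
    kept = [p for p, a in zip(pvals, adjusted) if a <= q]
    return max(kept) if kept else None

def fdr_at_p_cutoff(pvals, p_cut):
    """Fix a p-value cut-off; return the estimated FDR of the resulting gene list."""
    adjusted = bh_adjust(pvals)
    kept = [a for p, a in zip(pvals, adjusted) if p <= p_cut]
    return max(kept) if kept else None

# Hypothetical p-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(p_cutoff_for_fdr(pvals, 0.05))   # few DE genes expected: set q, read off p
print(fdr_at_p_cutoff(pvals, 0.05))    # many DE genes expected: set p, report FDR
```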
>
>While estimating FDR using the Bioconductor routines, you will probably 
>also estimate the percentage of genes that differentially express.  One 
>thing to note is that if you reject enough hypotheses to reach that 
>estimated percentage, you will end up with an FDR that is much too high 
>to be acceptable.  So, once you set a cut-off, you are almost certain to 
>have false non-detections as well.
>
>Oh yes,  I forgot to mention that there is no universally good value to use 
>for your cut-off.  If most of the genes are non-differentially expressing, 
>most of your errors will be false detects.  If most of the genes are 
>differentially expressing, most of your errors will be false non-detects.  
>So, there is no value that is good for every data set.
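
A back-of-the-envelope version of that last point, as a sketch only: the per-test cut-off of 0.01 and the power of 0.5 are invented numbers, and the function name is hypothetical. Expected false detects are (number of null genes) x alpha; expected false non-detects are (number of differentially expressing genes) x (1 - power).

```python
def expected_errors(n_tests, frac_null, alpha, power):
    """Expected error counts for a fixed per-test cut-off alpha.

    frac_null: fraction of genes that do NOT differentially express.
    power:     assumed chance of detecting a gene that really differs.
    """
    n_null = n_tests * frac_null
    n_alt = n_tests - n_null
    false_detects = n_null * alpha
    false_non_detects = n_alt * (1.0 - power)
    true_detects = n_alt * power
    declared = false_detects + true_detects
    fdr = false_detects / declared if declared > 0 else 0.0
    return false_detects, false_non_detects, fdr

# Same cut-off, two hypothetical data sets of 450,000 tests each.
mostly_null = expected_errors(450_000, frac_null=0.99, alpha=0.01, power=0.5)
mostly_de = expected_errors(450_000, frac_null=0.20, alpha=0.01, power=0.5)
print(mostly_null)
print(mostly_de)
```

With these made-up inputs the first data set's errors are mostly false detects and the second's are mostly false non-detects, even though the cut-off is identical. How the balance tips in a real experiment depends on the actual alpha and power.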
>
>--Naomi
>
>At 02:17 PM 10/19/2006, Sean Davis wrote:
>>Matthew Lyon wrote:
>> > Esteemed List:
>> >
>> > i need an alpha value for a t-test with about n=450,000 and a
>> > 1) df of 2
>> > 2) df of 4
>> >
>> > this is microarray data. i've been told bonferroni is too conservative for
>> > microarrays, hence interesting approaches like multtest, the q-value
>> > permuted one, etc...
>> >
>> > can anyone who deals in this area extensively (say, expression data) give me
>> > a ballpark value for t- or alpha- that's typically giving good 'oh man this
>> > is significantly different!' results? i've got my own hunches but would
>> > like some blinded numbers tossed at me too.
>> >
>>Look at the p.adjust() function if you already have p-values computed by
>>a t-test as a place to start.  Bonferroni should probably never be used,
>>as I think the Holm correction has the same assumptions but is less
>>conservative (you get something for nothing...).  Some of the more
>>stats-minded folks might be able to elaborate on that particular point,
>>but Holm is probably also too conservative.
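
For anyone who wants to see the "something for nothing" point in numbers, here is a minimal sketch of the two adjustments written out by hand. In R you would just call p.adjust(); this is not that function's source, only the textbook step-down definition, and the five p-values are hypothetical.

```python
def bonferroni(pvals):
    """Bonferroni: multiply every p-value by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def holm(pvals):
    """Holm step-down: the k-th smallest p-value is multiplied by (m - k + 1),
    then a running maximum keeps the adjusted values monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):      # rank is 0-based here
        candidate = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values from five t-tests.
raw = [0.001, 0.012, 0.02, 0.04, 0.5]
bonf = bonferroni(raw)
hol = holm(raw)
print(bonf)
print(hol)
print(sum(b < 0.05 for b in bonf), sum(h < 0.05 for h in hol))
```

Every Holm-adjusted value is at most the Bonferroni one, so Holm can only reject the same hypotheses or more. On this toy input it picks up one extra gene at the 0.05 level.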
>>
>>Sean
>>
>
>Naomi S. Altman                                814-865-3791 (voice)
>Associate Professor
>Dept. of Statistics                              814-863-7114 (fax)
>Penn State University                         814-865-1348 (Statistics)
>University Park, PA 16802-2111
>