[BioC] Multiple test question in microarray - FDR

Sat Dec 13 22:01:11 CET 2008

Thanks, Sean,
Your explanation makes sense to me. Since you say there are MANY
discussions of this topic on the mailing list, is there a way to search
the list archive so that I can read them?

Wayne
--

Sean Davis wrote:
> On Sat, Dec 13, 2008 at 12:36 PM, Wayne Xu <wxu at msi.umn.edu> wrote:
>   
>> Hello,
>> I am not sure this is the right place to ask this question, but it is about
>> microarray data analysis:
>>
>> In a two-group t-test, the multiple-testing q-values depend on the total
>> number of genes tested. If I filter the gene list first, for example by
>> keeping only genes with at least a 1.2-fold change before running the t-test
>> and the multiple-testing correction, the gene list is much smaller than the
>> full gene list, and the resulting q-values are much smaller.
>>
>> Do you think this approach is correct? People who do not filter this way may
>> worry that statistical power is lost. But how much power is lost, and how
>> can the power be calculated in this case?
>>     
>
> No, you cannot filter based on fold change.  However, you can filter
> based on variance or some other measure that does not depend on the
> two groups being compared.  Anything that filters genes based on
> "knowing" the two groups will lead to a biased test.  Remember that
> filtering removes genes from consideration in any further analysis.
>
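[Editorial illustration, not part of the original message.] The point
about group-aware filtering can be checked with a small simulation. The
sketch below is only an illustration under stated assumptions: every
gene is null (no true difference), values are on a log2 scale with an
assumed standard deviation of 0.5, there are 5 samples per group, and
"significant" means |t| exceeds 2.306, the two-sided 5% critical value
for 8 degrees of freedom. All names and parameter values are
hypothetical choices made for the example.

```python
import math
import random
import statistics

random.seed(1)

N_GENES = 10_000
N_PER_GROUP = 5
SIGMA = 0.5                        # assumed per-gene SD on the log2 scale
LOG2_FC_CUTOFF = math.log2(1.2)    # the 1.2-fold cutoff from the question
T_CRIT = 2.306                     # two-sided 5% critical value, 8 df


def pooled_t(a, b):
    """Return (mean difference, equal-variance t statistic) for equal-size groups."""
    n = len(a)
    diff = statistics.mean(a) - statistics.mean(b)
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    return diff, diff / math.sqrt(pooled_var * 2 / n)


# Simulate null genes: both groups come from the same distribution.
genes = []
for _ in range(N_GENES):
    a = [random.gauss(0, SIGMA) for _ in range(N_PER_GROUP)]
    b = [random.gauss(0, SIGMA) for _ in range(N_PER_GROUP)]
    diff, t = pooled_t(a, b)
    overall_var = statistics.variance(a + b)   # ignores group labels
    genes.append((diff, t, overall_var))


def false_positive_rate(subset):
    """Fraction of genes in the subset called significant at the 5% level."""
    return sum(abs(t) > T_CRIT for _, t, _ in subset) / len(subset)


# Filter 1: uses the group labels (fold change between the two groups).
fc_kept = [g for g in genes if abs(g[0]) > LOG2_FC_CUTOFF]

# Filter 2: does not use the group labels (top half by overall variance).
median_var = statistics.median(g[2] for g in genes)
var_kept = [g for g in genes if g[2] > median_var]

all_rate = false_positive_rate(genes)
fc_rate = false_positive_rate(fc_kept)
var_rate = false_positive_rate(var_kept)

print(f"no filter:          {all_rate:.3f}")
print(f"fold-change filter: {fc_rate:.3f}")
print(f"variance filter:    {var_rate:.3f}")
```

Under these assumptions, the unfiltered and variance-filtered rates
should stay near the nominal 5%, while the fold-change-filtered rate
should come out clearly higher. The fold-change filter removes mostly
non-significant null genes, so the per-gene tests on the remaining
genes no longer hold their stated error rate.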
> For further details, there are MANY discussions of this topic in the
> mailing list.
>
>   
>> When people report multiple-testing q-values, they usually do not mention how
>> many genes were included in the correction. You can get different q-values
>> (even with the same method, e.g. the Benjamini-Hochberg or Holm adjustment)
>> from the same dataset. How can that make sense if the same genes have
>> different q-values?
>>     
>
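[Editorial illustration, not part of the original message.] The
dependence of adjusted values on the number of tests can be seen
directly from the Benjamini-Hochberg step-up formula, in which the i-th
smallest p-value is multiplied by m / i (m being the number of tests)
and monotonicity is then enforced. The sketch below is a minimal
pure-Python version written for this example; the function name
bh_adjust and the p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping a running minimum so that
    # adjusted values never increase as the raw p-values decrease.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position            # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# The same three "top" genes, tested alongside 7 or 97 unremarkable genes.
top = [0.001, 0.004, 0.010]
small_list = top + [0.5] * 7           # 10 tests in total
large_list = top + [0.5] * 97          # 100 tests in total

adj_small = bh_adjust(small_list)
adj_large = bh_adjust(large_list)

print([round(q, 4) for q in adj_small[:3]])   # roughly 0.01, 0.02, 0.0333
print([round(q, 4) for q in adj_large[:3]])   # roughly 0.1, 0.2, 0.3333
```

With identical raw p-values, the adjusted values for the top three
genes differ tenfold between the two lists, simply because m changes
from 10 to 100. This is why the number of genes entering the
correction needs to be reported alongside the q-values.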
> A good manuscript should describe in detail the preprocessing and
> filtering steps, the statistical tests used, and the methods for
> correcting for multiple testing.  You are correct that many papers do
> not do so.
>
> As for different q-values in the same dataset using different methods,
> it is important to note that one should not do an analysis, get a
> result, and then, based on that result, go back and redo the analysis
> with different parameters to get a "better" result.  It is very
> important that each step of an analysis (preprocessing, filtering,
> testing, multiple-testing correction) be justifiable independent of
> the other steps in order for the results to be interpretable.
>
> Sean
>