[BioC] False Discovery Rate Questions

Tue Apr 15 14:38:34 CEST 2008

James W. MacDonald wrote:
>
> I think people put way too much stock in multiplicity adjustments. 
Being a statistician myself I probably shouldn't say this, but there is 
some truth in this statement. It seems we statisticians have been quite 
successful in convincing biologists that multiple testing is an issue 
that needs to be addressed. Unfortunately the main message that seems to 
have come across is: "You should never mention a gene in a paper or do any 
further research into it, unless its FDR value is below 5%", which 
creates the very understandable frustration Sally describes in her 
e-mail. My advice to biologists in situations like this is:

a) Look at the p-value distribution! If it is skewed towards small p-values 
the experiment has picked up something, even if it is not strong enough 
to give you many genes with an FDR-adjusted p-value below 5%.

b) Choose the cut-off depending on what you want to do after the 
microarray experiment. If the experiment is the end of the story and you 
only want to publish the list of most changed genes, you really want to 
be sure, so in that case an FDR below 5% is an appropriate criterion 
(any referee with some sense should criticize you otherwise. If you find 
a lot of papers with lists of unadjusted p-values, this doesn't mean 
that this is "good practice", rather that they "got away with it").
If, however, you want to choose candidate genes which you will study in 
follow-up experiments you might very well be willing to accept that 20% 
or 50% of them are false positives.

c) Take biological knowledge into account. For example: if a gene has an 
FDR value of 50% but it was one you expected to change or it has already 
been found to change in other similar studies, then it will obviously 
strengthen your result. A more coordinated way of utilising biological 
knowledge is not to analyse single genes but gene sets/pathways with one 
of the many gene set analysis tools (GSEA, GlobalTest, cf. also Sean 
Davis's posting "[BioC] combining p-values and independent genes 
stouffer" to this list today.)
A single gene may show some indication of being changed but not enough 
to jump the FDR<5% hurdle, but if many other related genes show a 
similar change the overall result might be highly significant.
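
As a rough illustration of points a) to c), here is a minimal Python 
sketch using only the standard library. The function names, the toy 
p-values and the cut-offs are assumptions made up for illustration, and 
Benjamini-Hochberg is assumed as the FDR adjustment. Stouffer's method 
is shown as one simple way to combine p-values over a gene set; it 
assumes one-sided p-values and independent genes, which is rarely true 
for co-regulated genes, so the combined value is likely to be optimistic.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def fraction_below(pvalues, threshold=0.05):
    """Crude check for point a): share of p-values under the threshold.

    Under a pure null (uniform p-values) this share is roughly equal to
    the threshold itself; a clearly larger share suggests real signal.
    """
    return sum(1 for p in pvalues if p < threshold) / len(pvalues)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_selected(adjusted, cutoff):
    """Point b): number of genes kept at a given FDR cut-off."""
    return sum(1 for q in adjusted if q <= cutoff)


def stouffer_combine(pvalues):
    """Point c): Stouffer's combined one-sided p-value for a gene set.

    Assumes independent tests (an assumption, see the note above).
    """
    # Clamp to the open interval (0, 1) so the inverse CDF is defined.
    clamped = [min(max(p, 1e-300), 1 - 1e-16) for p in pvalues]
    z_scores = [-_STD_NORMAL.inv_cdf(p) for p in clamped]
    z_combined = sum(z_scores) / sqrt(len(z_scores))
    return _STD_NORMAL.cdf(-z_combined)


if __name__ == "__main__":
    # Toy p-values, invented for illustration only.
    pvals = [0.001, 0.008, 0.02, 0.03, 0.04, 0.2, 0.4, 0.6, 0.8, 0.95]
    print("share below 0.05:", fraction_below(pvals))
    adj = bh_adjust(pvals)
    for cutoff in (0.05, 0.2, 0.5):
        print("FDR cut-off", cutoff, "->", count_selected(adj, cutoff), "genes")
    # A hypothetical gene set in which each gene is only borderline.
    print("combined p for the set:", stouffer_combine([0.05, 0.05, 0.05, 0.05]))
```

Running count_selected at 5%, 20% and 50% on the same adjusted values 
shows how the candidate list grows as you accept more false positives, 
which is the trade-off described in b).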

As this is an issue I have to discuss a lot with the biologists I 
collaborate with, I would be interested to hear how other people on the 
list see this.

Cheers

Claus

-- 
***********************************************************************************
 Dr Claus-D. Mayer                    | http://www.bioss.ac.uk
 Biomathematics & Statistics Scotland | email: claus at bioss.ac.uk
 Rowett Research Institute            | Telephone: +44 (0) 1224 716652
 Aberdeen AB21 9SB, Scotland, UK.     | Fax: +44 (0) 1224 715349