[BioC] False Discovery Rate Questions
Claus-Dieter Mayer
claus at bioss.ac.uk
Tue Apr 15 14:38:34 CEST 2008
James W. MacDonald wrote:
>
> I think people put way too much stock in multiplicity adjustments.
Being a statistician myself I probably shouldn't say this, but there is
some truth in this statement. It seems we statisticians have been quite
successful in convincing biologists that multiple testing is an issue
that needs to be addressed. Unfortunately the main message that seems to
have come across is: "You should never mention a gene in a paper or do any
further research into it, unless its FDR value is below 5%", which
creates the very understandable frustration Sally describes in her
e-mail. My advice to biologists in a situation like this is:
a) Look at the p-value distribution! If it is skewed towards small p-values
the experiment has picked up something, even if it is not strong enough
to give you many genes with an FDR-adjusted p-value below 5%.
b) Choose the cut-off depending on what you want to do after the
microarray experiment. If the experiment is the end of the story and you
only want to publish the list of most changed genes, you really want to
be sure, so in that case an FDR below 5% is an appropriate criterion
(any referee with some sense should criticize you otherwise. If you find
a lot of papers with lists of unadjusted p-values, this doesn't mean
that this is "good practice", rather that they "got away with it").
If however you want to choose candidate genes which you will study in
follow-up experiments you might very well be willing to accept that 20%
or 50% of them are false positives.
c) Take biological knowledge into account. For example: if a gene has an
FDR value of 50% but it was one you expected to change or it has already
been found to change in other similar studies, then it will obviously
strengthen your result. A more coordinated way of utilising biological
knowledge is not to analyse single genes but gene sets/pathways with one
of the many gene set analysis tools (GSEA, GlobalTest, cf. also Sean
Davis's posting "[BioC] combining p-values and independent genes
stouffer" to this list today.)
A single gene may show some indication of being changed but not enough
to jump the FDR<5% hurdle, but if many other related genes show a
similar change the overall result might be highly significant.
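
To make points a) to c) concrete, here is a minimal sketch in plain Python
(standard library only). It is an illustration under stated assumptions, not
a recommended pipeline: the function names are hypothetical, the p-values in
the example are invented, the FDR adjustment is the Benjamini-Hochberg
step-up procedure, and the gene set step uses Stouffer's method on one-sided
p-values. Stouffer's method assumes the genes are independent, which related
genes often are not, and it is a simpler technique than tools such as GSEA or
GlobalTest.

```python
from math import sqrt
from statistics import NormalDist


def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values in n_bins equal-width bins on [0, 1] (point a)."""
    counts = [0] * n_bins
    for p in pvalues:
        # p == 1.0 would index past the end, so clamp it into the last bin.
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is no larger than the one ranked after it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(pvalues, fdr_cutoff):
    """Indices of genes with adjusted p-value at or below fdr_cutoff (point b)."""
    return [i for i, q in enumerate(bh_adjust(pvalues)) if q <= fdr_cutoff]


def stouffer_combine(pvalues):
    """Combine one-sided p-values of a gene set, assuming independence (point c)."""
    normal = NormalDist()
    eps = 1e-12  # keep p strictly inside (0, 1) so inv_cdf is defined
    z = [normal.inv_cdf(1.0 - min(max(p, eps), 1.0 - eps)) for p in pvalues]
    return 1.0 - normal.cdf(sum(z) / sqrt(len(z)))


if __name__ == "__main__":
    # Invented p-values for eight genes; the first four form a made-up set.
    pvals = [0.004, 0.02, 0.03, 0.04, 0.2, 0.45, 0.7, 0.9]
    print("histogram:", pvalue_histogram(pvals, n_bins=5))
    for cutoff in (0.05, 0.20, 0.50):
        print("FDR <=", cutoff, "->", select_genes(pvals, cutoff))
    print("gene set p-value:", stouffer_combine(pvals[:4]))
```

A histogram that piles up in the leftmost bins corresponds to the skew
described in a); running select_genes with several cut-offs shows how the
candidate list grows as you accept more false positives, as in b).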
As this is an issue I have to discuss a lot with the biologists I collaborate
with, I would be interested to hear how other people on the list see this.
Cheers
Claus
--
***********************************************************************************
Dr Claus-D. Mayer | http://www.bioss.ac.uk
Biomathematics & Statistics Scotland | email: claus at bioss.ac.uk
Rowett Research Institute | Telephone: +44 (0) 1224 716652
Aberdeen AB21 9SB, Scotland, UK. | Fax: +44 (0) 1224 715349