[BioC] straight t vs. bonferroni vs. all the new stuff.

Sean Davis sdavis2 at mail.nih.gov
Fri Oct 20 03:58:41 CEST 2006


Matthew Lyon wrote:
> "Oh yes,  I forgot to mention that there is no universally good value 
> to use for your cut-off.  If most of the genes are non-differentially 
> expressing, most of your errors will be false detects.  If most of the 
> genes are differentially expressing, most of your errors will be false "
>
> i totally understand this. do you ever tend to see standard values (or 
> magnitudes) associated with things that are known/expected to differ, 
> however, like drug-induced upregulation of certain liver p450s?

Naomi is really not kidding.  There is no easy way out.  One must 
interpret the results not in a vacuum, but with regard to the known 
biology as well as the experimental design and goals.  For example, 
knowing that 5,000 genes are differentially expressed between a tumor 
and an associated normal tissue with a false discovery rate of 10% is 
perhaps not very meaningful for any one gene and certainly is not 
something that is easy to follow up on gene by gene.  
However, for determining enriched GO categories, a list of 5,000 genes 
will be just fine.  It will almost certainly not include all genes that 
are truly differentially expressed, but it IS 5,000 genes--more than 
enough.  In a different biological situation, perhaps comparing the 
blood of children and their parents for the effects of aging, the gene 
list at an FDR of 10% might include only 1 gene, but at 50% it includes 
12 genes.  In this case, having a 50% FDR might be totally acceptable, 
because each of 
these genes is potentially very valuable and can be validated by a 
second assay or via molecular biology or in a model organism. 
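
To put rough numbers on the two scenarios above, here is a minimal 
sketch in Python.  It assumes the FDR can be read as the expected 
fraction of false positives in the reported list, so the counts are 
expectations, not guarantees.  The function name and scenario labels 
are made up for illustration, and the gene counts are the fictitious 
ones from this message.

```python
def expected_false_discoveries(n_genes, fdr):
    """Expected number of false positives in a list of n_genes
    reported at the given false discovery rate (assumed to be the
    expected fraction of false calls in the list)."""
    return n_genes * fdr


# Fictitious scenarios taken from the message: (label, list size, FDR).
scenarios = [
    ("tumor vs. normal, FDR 10%", 5000, 0.10),
    ("children vs. parents, FDR 10%", 1, 0.10),
    ("children vs. parents, FDR 50%", 12, 0.50),
]

for label, n_genes, fdr in scenarios:
    n_false = expected_false_discoveries(n_genes, fdr)
    n_true = n_genes - n_false
    print(f"{label}: {n_genes} genes listed, "
          f"about {n_false:.1f} expected false, "
          f"about {n_true:.1f} expected true")
```

Under that reading, the 5,000-gene list carries roughly 500 expected 
false calls, which is tolerable for a GO enrichment but too many to 
chase one at a time, while the 12-gene list at 50% still leaves about 
6 candidates that are cheap to validate individually.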

So, although the examples are entirely fictitious, you can see that in 
different situations different degrees of statistical certainty are 
acceptable and, in fact, encouraged.  That isn't to say that there are 
no rules, but you can see that what serves one project well might be 
entirely inappropriate in another when taken in the context of the 
project goals and underlying biology.

Sean


