[BioC] can I use FDR correction with hyperGTest conditional GO method?

Mon Feb 12 05:43:47 CET 2007

Here's a question for the serious statisticians amongst us.

The function hyperGTest of package "GOstats" implements a method similar 
to the elim method of Alexa et al. (2006). Alexa et al. claim that the 
oft-used hypergeometric test on the entire ontology can't be analyzed for 
FDR because of the highly interdependent nature of the DAG structure of 
GO. The authors go on to claim that their methods decrease this 
interdependence but, as far as I can tell, never directly answer the 
question of whether the resulting p-values can be corrected for FDR.

For the purpose of the following discussion, assume that we are only 
working with one of the 3 major GO categories. It is true that 
dependence has been decreased because a parent cannot reverse inherit a 
gene from its child. But can several children at the same level still 
share genes? I'm not sure.

If there is gene overlap at the lowest levels of the GO graph structure, 
then it seems to me that there is still dependence and FDR cannot be 
assessed. Correct?

If there is no gene overlap at the lowest levels of the GO graph 
structure, then it seems to me that these levels are independent and FDR 
can be applied. Correct?
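
To make the overlap question concrete, here is a minimal sketch in Python 
of what I mean by "gene overlap" between terms. The term IDs and gene 
names are made up for illustration, and shared_genes is a hypothetical 
helper, not a GOstats function.

```python
from itertools import combinations

# Hypothetical leaf-term annotations (term IDs and gene names are made up).
leaf_genes = {
    "GO:A": {"g1", "g2", "g3"},
    "GO:B": {"g3", "g4"},
    "GO:C": {"g5"},
}

def shared_genes(term_to_genes):
    """Map each pair of terms to the genes they share, omitting empty overlaps."""
    overlaps = {}
    for a, b in combinations(sorted(term_to_genes), 2):
        common = term_to_genes[a] & term_to_genes[b]
        if common:
            overlaps[(a, b)] = common
    return overlaps

overlap = shared_genes(leaf_genes)
```

An empty result would correspond to the "no overlap" case; any non-empty 
result means at least two terms are tested on partly the same genes.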

Would someone who really knows GO answer the question about overlap of 
genes at the lowest levels, and could a statistician then answer the 
questions regarding dependence/independence and the applicability of 
an FDR method such as BH or Storey's q-value?
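
For reference, the BH adjustment itself is mechanically simple. The 
following is a minimal sketch in Python of the step-up adjustment, using 
made-up p-values for five terms. The function name bh_adjust is my own. 
The sketch only shows the arithmetic and says nothing about whether the 
dependence among conditional GO p-values makes the result valid, which is 
the question being asked.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for five GO terms (illustrative only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.60]
example_adj = bh_adjust(example_p)
```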

Thanks,

Mark

-- 
Mark W. Kimpel MD
Neuroinformatics
Department of Psychiatry
Indiana University School of Medicine