[BioC] can I use FDR correction with hyperGTest conditional GO method?

Kevin R. Coombes krc at mdacc.tmc.edu
Mon Feb 12 19:27:27 CET 2007


Hi,

Take a look at
	Gold DL, Coombes KR, Wang J, Mallick B.
	Enrichment analysis in high-throughput genomics--accounting for
	dependency in the NULL.
	Brief Bioinform. 2006 Oct 31; [Epub ahead of print]
when it comes out. (An earlier version is available as a tech report on 
our web site at http://bioinformatics.mdanderson.org)  We have looked at 
how the model needs to be changed to account for pairwise 
interactions/correlations between categories.  One key point is that the 
relative rankings of the importance of different GO categories do not 
appear to change very much if you improve the model by accounting for 
dependence.

This does not directly address the FDR question you raised.  But it 
suggests that dependence is actually weaker than you might think, so the 
usual FDR assessments might be close to correct.
	Kevin
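
To make the "usual FDR assessments" concrete, here is a minimal sketch of
the Benjamini-Hochberg (BH) adjustment applied to a vector of per-category
p-values. The function name bh_adjust and the example p-values are made up
for illustration; this is not the GOstats code. Note that the BH guarantee
formally assumes independent or positively dependent tests, which is
exactly the assumption under discussion in this thread.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Hypothetical helper for illustration only.
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, scaling each by
    # m / rank and taking a running minimum so the adjusted values stay
    # monotone in the original p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values standing in for one p-value per GO category.
    example = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(example, bh_adjust(example)):
        print(f"p = {p:.3f}  BH-adjusted = {q:.3f}")
```

Categories whose adjusted value falls below the chosen FDR level (say
0.05) would be reported as enriched.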


Mark W Kimpel wrote:
> Here's a question for the serious statisticians amongst us.
> 
> The function hyperGTest of package "GOstats" implements a method similar 
> to Alexa et al. (2006) (elim method). Alexa et al. claim that the oft 
> used hypergeometric test on the entire ontology can't be analyzed for 
> FDR because of the highly interdependent nature of the DAG structure of 
> GO. The authors go on to claim that their methods decrease this 
> interdependence, but, as far as I can tell, never directly answer the 
> question as to whether the resultant p values can be corrected for FDR.
> 
> For the purpose of the following discussion, assume that we are only 
> working with one of the 3 major GO categories. While it is true that 
> dependence has been decreased because a parent cannot reverse inherit a 
> gene from its child, several children at the same level can share genes, 
> or can they? I'm not sure.
> 
> If there is gene overlap at the lowest levels of the GO graph structure, 
> then it seems to me that there is still dependence and FDR cannot be 
> assessed. Correct?
> 
> If there is no gene overlap at the lowest levels of the GO graph 
> structure, then it seems to me that these levels are independent and FDR 
> can be applied. Correct?
> 
> Would someone who really knows GO answer the question about overlap of 
> genes at the lowest levels and then could a statistician answer the 
> questions regarding dependence/independence and the applicability of 
> an FDR method such as BH or the Storey qvalue?
> 
> Thanks,
> 
> Mark
>


