[BioC] can I use FDR correction with hyperGTest conditional GO method?

Mark W Kimpel mwkimpel at gmail.com
Mon Feb 12 21:39:20 CET 2007


Robert,

I understand exactly what you are saying, and I think I have a reasonable 
understanding of what FDR correction does, at least in the "perfect" 
statistical situation where no tests are dependent on one another and 
everything is normally distributed. That's easy, at least statistically.

But, of course, that is not the reality we deal with in the statistics of 
microarrays. We know for certain that the expression of many genes is 
dependent on the expression of others; heck, that's how the whole thing 
works! Most of the time, however, we don't know which genes are 
dependent on which others; that is often the point of our experiments. 
For example, I stimulate this receptor with this drug and see what 
happens. Given this, there seems to be a consensus that some FDR 
correction is better than none, so we can select from a number of 
methods depending on whether we want to err on the side of caution or 
want to lessen our FNR at the cost of possibly raising the true FDR 
(which we can never know for sure).

Given that, it is clear that GO analysis, as previously implemented by 
such programs as DAVID, EASE, and GOstats, had far too much dependency 
built into the parent-child relationships to accommodate FDR correction.

Alexa et al. (2006) seem to contend that their conditional method 
"decorrelates" the graph structure of GO, and this led me to wonder 
whether we could now apply an FDR correction that would be acceptable 
for publication. There is certainly still some duplication and 
dependence, but it has been substantially addressed by the conditional 
method that has been incorporated into hyperGTest of package "GOstats".
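
For concreteness, here is roughly how I am running the conditional test. 
This is only a minimal sketch, not my exact code; the gene ID vectors and 
the chip annotation package ("hgu95av2") are placeholders for whatever 
your own experiment uses.

library(GOstats)

## Placeholder inputs: Entrez Gene IDs for the "special" genes and for the
## universe of genes actually tested on the array.
selected <- selectedEntrezIds    # hypothetical vector of interesting genes
universe <- universeEntrezIds    # hypothetical vector of all tested genes

params <- new("GOHyperGParams",
              geneIds         = selected,
              universeGeneIds = universe,
              annotation      = "hgu95av2",   # assumed chip annotation package
              ontology        = "BP",         # one of the 3 major GO categories
              pvalueCutoff    = 0.01,
              conditional     = TRUE,         # the conditional (parent-child) method
              testDirection   = "over")

hgCond <- hyperGTest(params)
head(summary(hgCond))            # one row per GO term, with p-values and counts
pvals  <- pvalues(hgCond)        # named vector of per-term p-values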

I feel like it is better than nothing, but I am not a professional 
statistician, so I would like some guidance and would like to see whether 
there is a consensus before I proceed with an analysis for publication.

As an aside, I have written a somewhat time-consuming function that 
iterates over gradually decreasing p-value cutoffs, calculates the FDR 
for each iteration using the qvalue method of Storey, and yields the 
parameters that give the optimal p-value cutoff: the one that keeps the 
FDR below a predetermined amount while still maximizing the number of 
categories returned. I don't think that is cheating; to me it is the 
same logic that is applied to setting parameters in SAM of package 
siggenes.
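
In outline, it does something like the following. This is a simplified 
sketch rather than my actual function: run_hyperg() just reruns the 
conditional hyperGTest from the sketch above at a given cutoff (assuming 
the pvalueCutoff<- setter from package Category), and the 0.05 target is 
purely illustrative.

library(qvalue)

## Hypothetical helper: rerun the conditional hyperGTest at a given p-value
## cutoff (for the conditional test the conditioning, and hence the p-values,
## depend on the cutoff) and return the vector of per-term p-values.
run_hyperg <- function(cutoff) {
  pvalueCutoff(params) <- cutoff         # assumed setter from package Category
  pvalues(hyperGTest(params))
}

target.fdr <- 0.05                            # predetermined FDR bound (illustrative)
cutoffs    <- seq(0.05, 0.001, by = -0.001)   # gradually decreasing cutoffs

best <- NULL
for (cut in cutoffs) {
  p     <- run_hyperg(cut)
  q     <- qvalue(p)                          # Storey q-value estimates
  n.sig <- sum(p <= cut)
  if (n.sig == 0) next
  fdr   <- max(q$qvalues[p <= cut])           # estimated FDR among reported terms
  if (fdr < target.fdr && (is.null(best) || n.sig > best$n.sig)) {
    best <- list(cutoff = cut, n.sig = n.sig, est.fdr = fdr)
  }
}
best    # cutoff that keeps estimated FDR below target while maximizing terms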

Interesting discussion and one that I hope will provide for some 
continued scholarly debate.

Mark

Robert Gentleman wrote:
> Hi Mark,
>   There has been a fair amount of discussion of these issues already, 
> searching the mailing list will help to reveal the salient points.
> 
>   The most important question here is what do *you* think p-value 
> correction is going to do for you?  In my opinion (and lots of folks 
> seem to have different views), p-value corrections do two things for us.
> Both are related to the observation that, under a composite null (all 
> null hypotheses are true), the smallest p-value when testing 10K 
> hypotheses is typically much smaller than the smallest p-value when 
> testing 5K. And most of us need some help deciding/interpreting these 
> outputs.
> 
> 1) If you test some large number of hypotheses, p-value corrections 
> allow you to interpret the p-values in some holistic way. Here one is 
> trying to answer the question of whether any (or how many) of the 
> hypotheses are truly false. And it sort of works, but basically the 
> "correction" is almost always a reduction in the significance level, 
> and so not only do you enrich the set of "called false" hypotheses for 
> truly false ones, you also make more errors of the other kind (not 
> rejecting hypotheses that are false).
> 
> 2) If you have two experiments, one with 5K hypotheses, and one with 
> 10K, then p-value corrections allow you to "align" the evidence, and to 
> compare in some sensible way the two experiments.
> 
> I am not aware of any other contributions that these methods can make, 
> but perhaps others will enlighten us.
> 
>  Now, when we turn our attention to GO, the problem is not one of 
> p-value correction, but one of philosophy, again in my view. Consider 
> the following situation (which does often arise).
> 
>  Consider two nodes in the GO graph, with a parent-child relationship, 
> and further consider a given set of data, where you have some set of 
> tested genes (which define your universe) and some set of genes you have 
> decided are *special*. Next we find that for these two nodes in the 
> graph, the same set of genes is annotated at both (for all genes in the 
> organism this will not be true, but we didn't measure them all and we 
> only get to work with what we measured). So now the two p-values from 
> your hypergeometric test are identical. No amount of p-value correction 
> (or even p-value psychotherapy) will change that. So which node do you 
> report? This is entirely philosophy and not mathematics.  Current 
> scientific practice is to report the more specific of the nodes, and to 
> only make more general claims (e.g., I cured cancer) over less general 
> ones (e.g., I cured person X, who had cancer) only when there is 
> additional evidence, over and above that needed for the specific claim. 
> That is the point of the conditional analyses.
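
[To make the arithmetic in Robert's example concrete for myself, with 
made-up numbers: the hypergeometric p-value depends only on the annotated 
set, the universe, and the selected genes, so if parent and child are 
annotated with exactly the same measured genes they get exactly the same 
p-value. The counts below are purely illustrative.]

## Hypothetical counts: universe of 1000 tested genes, 100 of them "special",
## and the *same* 40 genes annotated at both the parent and the child term,
## 12 of which are among the special genes.
p.child  <- phyper(12 - 1, 40, 1000 - 40, 100, lower.tail = FALSE)
p.parent <- phyper(12 - 1, 40, 1000 - 40, 100, lower.tail = FALSE)
identical(p.child, p.parent)   # TRUE: no correction can separate the two nodes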
> 
>  Now of course, a much better way to do the whole thing is to use GSEA 
> (e.g., the Category package), but then you will eventually end up back at 
> the same place. When you are dealing with dependent hypotheses, there is 
> always going to be a philosophical, not just a mathematical, issue to 
> deal with.
> 
>  best wishes
>    Robert
> 
> 
> Mark W Kimpel wrote:
>> Here's a question for the serious statisticians amongst us.
>>
>> The function hyperGTest of package "GOstats" implements a method 
>> similar to that of Alexa et al. (2006) (the elim method). Alexa et al. 
>> claim that the oft-used hypergeometric test on the entire ontology 
>> can't be analyzed for FDR because of the highly interdependent nature 
>> of the DAG structure of GO. The authors go on to claim that their 
>> methods decrease this interdependence but, as far as I can tell, never 
>> directly answer the question of whether the resultant p-values can 
>> be corrected for FDR.
>>
>> For the purpose of the following discussion, assume that we are only 
>> working with one of the 3 major GO categories. While it is true that 
>> dependence has been decreased because a parent cannot reverse inherit 
>> a gene from its child, several children at the same level can share 
>> genes, or can they? I'm not sure.
>>
>> If there is gene overlap at the lowest levels of the GO graph 
>> structure, then it seems to me that there is still dependence and FDR 
>> cannot be assessed. Correct?
>>
>> If there is no gene overlap at the lowest levels of the GO graph 
>> structure, then it seems to me that these levels are independent and 
>> FDR can be applied. Correct?
>>
>> Would someone who really knows GO answer the question about overlap of 
>> genes at the lowest levels, and then could a statistician answer the 
>> questions regarding dependence/independence and the applicability of 
>> an FDR method such as BH or the Storey q-value?
>>
>> Thanks,
>>
>> Mark
>>
> 

-- 
Mark W. Kimpel MD
Neuroinformatics
Department of Psychiatry
Indiana University School of Medicine
