[BioC] GOstats gene set size selection

Assaf Oron aoron at fhcrc.org
Sat Apr 19 00:00:30 CEST 2008


Hi Alex,

I'm not sure whether you can directly prevent the hyperGTest function 
from testing small gene-sets. By default, it tests all gene-sets.

However, you can certainly filter the list of significant gene-sets by 
set size.

The summary() function for the test output has a "categorySize" argument.

Suppose the output of your test is called "testresults".

Then do: summary(testresults, categorySize=3) to filter out sets of 2 or 
fewer genes.

My gut feeling is that you won't get significant gene-sets with 2 genes 
anyway, so to see any appreciable change in your results you'll have to 
set the threshold higher. The smallest significant set in the vignette 
example has 7 genes (6 of them marked as significant).

Additionally, the summary function itself returns a data frame, one of 
whose columns is "Size", so you can always sort and filter the 
gene-set list later as well.
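
Here is a minimal sketch of that later filtering, in base R only. The data 
frame below is a made-up stand-in for what summary(testresults) returns: 
only the "Size" column name comes from the above, while the other columns, 
the values, and the threshold of 5 are assumptions for illustration.

```r
# Hypothetical stand-in for the data frame returned by summary(testresults).
# All values are invented; only the "Size" column name is taken from the text.
res <- data.frame(
  ID     = c("set_A", "set_B", "set_C", "set_D"),
  Pvalue = c(0.001, 0.004, 0.020, 0.030),
  Size   = c(7, 2, 40, 3),
  stringsAsFactors = FALSE
)

min_size <- 5  # assumed threshold; pick whatever suits your data

# Keep gene-sets with at least min_size genes ...
filtered <- res[res$Size >= min_size, ]

# ... then sort the survivors by set size, smallest first.
filtered <- filtered[order(filtered$Size), ]

print(filtered)
```

With a real result you would replace the mock data frame by 
res <- summary(testresults) and leave the rest unchanged.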


Hope this helps,
cheers,
Assaf


