[R] A question regarding hypergeometric test

rad mac alaextractor1 at gmail.com
Sat Jul 28 16:06:03 CEST 2012


Dear all,

I have a simple question regarding gene set enrichment analysis.
Say we have a gene universe (the denominator) and a gene list of
interest (the numerator); the hypergeometric test then looks like:

p = phyper(white - 1, total_white, total_black, drawn, lower.tail = FALSE)
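
A minimal sketch of this call with made-up counts, to make the argument order concrete. The variable names are only illustrative, and the sketch assumes the upper tail P(X >= white) is the quantity of interest, which is why white - 1 is passed together with lower.tail = FALSE:

```r
# Hypothetical counts, chosen only for illustration
white       <- 50    # drawn genes that belong to the gene set
total_white <- 200   # genes in the universe that belong to the gene set
total_black <- 9800  # genes in the universe that do not
drawn       <- 500   # size of the gene list of interest

# Upper tail: P(X >= white) = P(X > white - 1)
p_upper <- phyper(white - 1, total_white, total_black, drawn,
                  lower.tail = FALSE)

# The same quantity obtained by summing the point probabilities
p_sum <- sum(dhyper(white:min(drawn, total_white),
                    total_white, total_black, drawn))
```

Both p_upper and p_sum should agree up to floating-point error; without lower.tail = FALSE, phyper() returns the lower tail P(X <= white - 1) instead.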

However, I am unsure how to handle the database size. Say my
denominator (total genes on the array) is 10000, but the database
(say, the GO database) contains only 8000 of these 10000 genes. The
question is: should I remove the genes that are not in the database
from all the values passed to phyper? In other words:

Original call, i.e.: phyper(50,200,9800,500).

After subtracting the genes that are not in the database, for example:
phyper(50,180,7700,400).
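
The two alternatives can be put side by side as a rough sketch. It uses the example numbers from the two calls above unchanged; lower.tail = FALSE is added on the assumption that the upper-tail (enrichment) probability is what is intended, and the variable names are arbitrary:

```r
# Option 1: the whole array as the universe (example numbers from above)
p_array <- phyper(50, 200, 9800, 500, lower.tail = FALSE)

# Option 2: only genes present in the database (example numbers from above)
p_db <- phyper(50, 180, 7700, 400, lower.tail = FALSE)

# Print both p-values next to each other
print(c(array = p_array, database = p_db))
```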

Should I restrict my gene lists to the genes that have database records? Which way is correct?
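
To be explicit about what restricting the lists would mean, here is a small sketch with made-up gene identifiers. The vectors array_genes, db_genes, my_list and gene_set are hypothetical and not taken from any package; the sketch only shows how the four counts would be derived if everything were limited to genes in the database, not which option is right:

```r
# Hypothetical identifiers, for illustration only
array_genes <- paste0("g", 1:20)           # all genes on the array
db_genes    <- paste0("g", 1:16)           # genes with a database record
my_list     <- paste0("g", c(1:5, 18:19))  # gene list of interest
gene_set    <- paste0("g", c(1:3, 10:12))  # one database category

# Restrict the universe, the list and the category to annotated genes
universe <- intersect(array_genes, db_genes)
drawn_db <- intersect(my_list, universe)
set_db   <- intersect(gene_set, universe)

# Counts for the hypergeometric test
white       <- length(intersect(drawn_db, set_db))
total_white <- length(set_db)
total_black <- length(universe) - total_white
drawn       <- length(drawn_db)

# Upper-tail probability P(X >= white), assuming enrichment is tested
p <- phyper(white - 1, total_white, total_black, drawn, lower.tail = FALSE)
```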

Thank you in advance for the replies.


