[BioC] hyperGTest results I do not understand

ariel at df.uba.ar ariel at df.uba.ar
Tue Feb 27 16:58:13 CET 2007


Dear list,
I am trying to learn how the results reported by the hyperGTest function are
calculated, but I am running into some values that I do not understand.
In the following example, 'selectedEntrezIds' is a list of 1507 non-duplicated
modulated Entrez IDs, all of them included in 'entrezUniverse', a list of 3122
non-duplicated Entrez IDs taken as the universe.

Here is the code:
> hgCutoff <- 0.05
> params <- new("GOHyperGParams",
+              geneIds=selectedEntrezIds,
+              universeGeneIds=entrezUniverse,
+              annotation="hgu133plus2",
+              ontology="BP",
+              pvalueCutoff=hgCutoff,
+              conditional=TRUE,
+              testDirection="over")

> hgOver.BP <- hyperGTest(params)
> summary(hgOver.BP)[1, -7]
           ID      Pvalue OddsRatio ExpCount Count Size
1 GO:0007229 0.002376785  7.049593 13.80089    13   15

For this particular node I think that the corresponding contingency table
can be written as:

           selected  ~selected
gonode           13          2
~gonode        1494       1613

for which ExpCount should be 15*1507/3122 = 7.24, and not 13.8 as reported.
(The p-value I am getting is also slightly different: 0.002428757 with
phyper, and 0.002429 with Fisher's exact test.)
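
For reference, here is a minimal sketch of the hand calculation described above, using only base R and the counts from the table. The variable names are mine, and this is just the plain (unconditional) hypergeometric calculation on that table; it is not a claim about what hyperGTest does internally.

```r
# Counts taken from the contingency table above
n_universe <- 3122   # size of entrezUniverse
n_selected <- 1507   # size of selectedEntrezIds
node_size  <- 15     # universe genes annotated at GO:0007229
node_hits  <- 13     # selected genes annotated at GO:0007229

# Expected number of selected genes at the node under the null
exp_count <- node_size * n_selected / n_universe

# Over-representation p-value: P(X >= node_hits) for a hypergeometric X,
# drawing node_size genes from n_selected "selected" and the remaining
# "not selected" genes
p_hyper <- phyper(node_hits - 1,
                  n_selected,
                  n_universe - n_selected,
                  node_size,
                  lower.tail = FALSE)

# The same test via Fisher's exact test on the 2x2 table (one-sided)
tab <- matrix(c(13, 1494,
                 2, 1613),
              nrow = 2,
              dimnames = list(c("gonode", "~gonode"),
                              c("selected", "~selected")))
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(round(exp_count, 2))
print(p_hyper)
print(p_fisher)
```

The one-sided Fisher test and the upper tail of phyper are the same calculation, so the two p-values should agree up to rounding.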

For other GO nodes I am even getting ExpCount values greater than the
node size!

What am I missing here?

Thanks
Ariel./