[BioC] understanding GOstats p-value

Janet Young jayoung at fhcrc.org
Sat Jan 5 03:35:29 CET 2008


I have a fairly basic question: I want to make sure I understand the
p-values that GOstats hyperGTest reports. Am I right in thinking that
each p-value is for enrichment of that category individually (i.e. NOT
corrected for multiple testing)?

I'm analyzing array CGH data, so I am testing a lot of categories (my
universe is all human genes that have a chromosome position, a GO
category and an Entrez ID). Below is an example result. My
interpretation is that I shouldn't get too excited about finding 3
categories with p < 0.001 when I've tested 2261 categories, since I
would expect about 2 such categories by chance alone (2261 x 0.001 is
about 2.3). Have I understood that correctly?

> hgCondOver
Gene to GO BP Conditional test for over-representation
2261 GO BP ids tested (3 have p < 0.001)
Selected gene set size: 1433
    Gene universe size: 12325
    Annotation package: org.Hs.eg.db
> summary(hgCondOver)
               GOBPID       Pvalue OddsRatio  ExpCount Count Size                     Term
GO:0007156 GO:0007156 0.0001330755  2.470839 12.905720    27  111 homophilic cell adhesion
GO:0001894 GO:0001894 0.0007587546  5.553301  2.209087     8   19       tissue homeostasis
GO:0007600 GO:0007600 0.0009353695  1.446591 74.062556   100  637       sensory perception
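
To make the multiple-testing arithmetic concrete, here is a minimal sketch in base R. It assumes all 2261 tests are treated as one family and uses only the three p-values printed above; the variable names are mine, for illustration. With the full result, the complete vector of p-values would go to p.adjust() instead. Since GO categories overlap and the test is conditional, the tests are not independent, so this is only a rough guide.

```r
# p-values copied by hand from the summary(hgCondOver) table above
p <- c("GO:0007156" = 0.0001330755,
       "GO:0001894" = 0.0007587546,
       "GO:0007600" = 0.0009353695)

n_tests <- 2261   # number of GO BP ids tested, from the printed result
alpha   <- 0.001  # threshold used in the printed result

# Expected number of categories below alpha if no category were truly enriched
expected_fp <- n_tests * alpha

# Bonferroni adjustment against all 2261 tests, not just the 3 shown here
p_bonf <- p.adjust(p, method = "bonferroni", n = n_tests)

print(expected_fp)
print(round(p_bonf, 3))
```

Under these assumptions the expected number of chance hits is 2.261, and only GO:0007156 keeps a Bonferroni-adjusted p-value below 1 (about 0.30), which fits the cautious reading of the three hits.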

Thanks very much,

Janet Young


Dr. Janet Young (Trask lab)

Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org

