[BioC] TopGO p-values VS DAVID p-values

Paul Geeleher paulgeeleher at gmail.com
Thu Feb 2 16:16:46 CET 2012


Hi,

I'm trying to reproduce results I got from DAVID
(http://david.abcc.ncifcrf.gov/) using topGO. For the GO MF category,
the top 10 terms I get from topGO are basically the same as DAVID but
the p-values are drastically different.

I'm using a named factor (intGenes) to define my "interesting" and
"uninteresting" gene symbols. My microarray doesn't have an annotation
package, so I'm using "org.Hs.eg.db" instead. My code is as follows:

library(topGO)

# Build the topGO object, mapping gene symbols to GO terms via org.Hs.eg.db
GOdata <- new("topGOdata", ontology = "MF", allGenes = intGenes,
              annot = annFUN.org, mapping = "org.Hs.eg.db", ID = "symbol",
              nodeSize = 20)
# Classic (per-term, independent) Fisher exact test
test.stat <- new("classicCount", testStatistic = GOFisherTest,
                 name = "Fisher test")
resultFisher <- getSigGroups(GOdata, test.stat)
allRes <- GenTable(GOdata, classic = resultFisher, orderBy = "classic",
                   ranksOf = "classic", topNodes = 10)
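
For reference, a named factor of the kind passed as allGenes above could
be built roughly as follows. This is only a sketch: the gene symbols and
the choice of "interesting" genes are hypothetical placeholders, not the
real data, and the 0/1 coding (1 = interesting) is an assumption about
how intGenes was constructed.

```r
# Hypothetical gene universe and hypothetical "interesting" subset
all_symbols <- c("TP53", "MYC", "GAPDH", "ACTB", "FOXA1")
interesting <- c("TP53", "FOXA1")

# Two-level factor: 1 = interesting, 0 = uninteresting
intGenes <- factor(as.integer(all_symbols %in% interesting), levels = c(0, 1))

# The names carry the gene identifiers (symbols here, to match ID = "symbol")
names(intGenes) <- all_symbols

print(table(intGenes))
```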

My top result in DAVID has a Fisher exact p-value of 0.006, but in
"allRes" the same term (Term=sequence-specific DNA binding,
annotated=518, significant=41, expected=18.92) only has a p-value of
0.31 (in the column labeled "classic").

Manually plugging the numbers into fisher.test(), it looks like DAVID
is much closer to what the p-value should be. Can anyone point me to
where I've gone wrong with topGO? As far as I can tell, I've followed
the instructions correctly.
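
The manual check can be sketched like this. The helper go_fisher() is a
hypothetical name, and the two totals (n_all = 15000 genes in the
universe, n_interesting = 548) are NOT from my data: they are made-up
values chosen only so that the expected count comes out at about 18.92.
The real totals would need to be substituted. The test is one-sided
(enrichment), which is an assumption about what both tools report.

```r
# One-sided Fisher exact test for enrichment of a single GO term.
# annotated:     genes in the universe annotated to the term
# significant:   interesting genes annotated to the term
# n_all:         total genes in the universe
# n_interesting: total interesting genes
go_fisher <- function(annotated, significant, n_all, n_interesting) {
  tab <- matrix(
    c(significant,                 annotated - significant,
      n_interesting - significant, n_all - annotated - n_interesting + significant),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("in_term", "not_in_term"), c("interesting", "other"))
  )
  fisher.test(tab, alternative = "greater")$p.value
}

# Term counts from allRes; the two totals are hypothetical (see above)
n_all <- 15000
n_interesting <- 548
expected <- 518 * n_interesting / n_all   # expected count under independence
p <- go_fisher(annotated = 518, significant = 41,
               n_all = n_all, n_interesting = n_interesting)
print(expected)
print(p)
```

With the real totals in place, comparing this value against both the
"classic" column and the DAVID output should show which tool it agrees
with.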

Thanks,

Paul.

-- 
Paul Geeleher (PhD Student)
School of Mathematics, Statistics and Applied Mathematics
National University of Ireland
Galway
Ireland
--
www.bioinformaticstutorials.com


