[BioC] Hypergeometric test with Disease Ontology
Ted Morrow
ted.morrow at ebc.uu.se
Thu Jan 27 12:00:18 CET 2011
Dear all,
I would like to conduct a hypergeometric test on a list of genes but
using Disease Ontology instead of GO or KEGG terms. The package "DO.db"
contains this information but I have been unable to find a way of using
this database in conjunction with the "GOstats" package that I have been
using.
Has anyone attempted to run a hypergeometric test for Disease Ontology
terms? Is there another package I could use? Or is there a way of
modifying the argument (if that's the right word) GOHyperGParams in
GOstats so that it can make use of the information in DO.db?
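
For reference, the test itself does not depend on GOstats. Below is a minimal sketch in base R using phyper(). It assumes a gene-to-term mapping is already available as a named list, with Entrez IDs as names and term IDs as values. The names gene2term and hyperg_by_term and the toy term IDs are hypothetical; none of them come from DO.db or GOstats.

```r
# Over-representation test per term, using only base R.
# gene2term: named list, names = gene IDs, values = character vectors of term IDs.
hyperg_by_term <- function(selected, universe, gene2term) {
  # Restrict the universe to genes with at least one annotation,
  # and the selected set to genes inside that universe.
  universe <- intersect(unique(universe), names(gene2term))
  selected <- intersect(unique(selected), universe)
  ann <- gene2term[universe]

  # Invert the mapping: term -> genes annotated with that term.
  n_terms <- vapply(ann, length, integer(1))
  term2gene <- split(rep(names(ann), n_terms), unlist(ann, use.names = FALSE))

  N <- length(universe)  # universe size
  n <- length(selected)  # selected set size
  res <- do.call(rbind, lapply(names(term2gene), function(term) {
    genes <- unique(term2gene[[term]])
    K <- length(genes)              # universe genes carrying the term
    k <- sum(genes %in% selected)   # selected genes carrying the term
    # P(X >= k) when drawing n genes from N, K of which carry the term
    p <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
    data.frame(term = term, count = k, size = K, pvalue = p,
               stringsAsFactors = FALSE)
  }))
  res[order(res$pvalue), ]
}

# Toy data (placeholders, not real annotations)
gene2term <- list(g1 = "T1", g2 = "T1", g3 = c("T1", "T2"), g4 = "T2",
                  g5 = "T2", g6 = "T2", g7 = "T2", g8 = "T2",
                  g9 = "T3", g10 = "T3")
res <- hyperg_by_term(selected = c("g1", "g2", "g3"),
                      universe = names(gene2term),
                      gene2term = gene2term)
print(res)
```

With real data, selectedEntrezIds and entrezUniverse would take the place of the toy vectors, and the mapping would have to be built from whatever gene-to-DO annotation is available.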
Both the GO and KEGG analyses work fine:
### GO chunk
params <- new("GOHyperGParams",
geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
annotation = "hgu95av2.db", ontology="BP",
pvalueCutoff=0.01, conditional=FALSE,
testDirection="over")
hgOver <- hyperGTest(params)
hgOver
Gene to GO BP test for over-representation
2136 GO BP ids tested (15 have p < 0.01)
Selected gene set size: 112
Gene universe size: 951
Annotation package: hgu95av2
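
A note on the sizes in this output: the test reports 112 selected genes and a universe of 951, while the input vectors in this post hold 157 and 1310 IDs. The likely explanation, stated here as an assumption about how the universe is built, is that only genes with at least one annotation in the chosen ontology are kept, and the selected set is then restricted to that universe. A toy sketch of that filtering (all IDs and the vector annotated are made up for illustration):

```r
# Toy identifiers, not taken from the post.
universe  <- c("1", "2", "3", "4", "5", "6")
selected  <- c("2", "3", "7")        # "7" is not in the universe
annotated <- c("1", "2", "4", "5")   # hypothetical: IDs with at least one term

# Keep only annotated universe genes, then restrict the selection to them.
universe_used <- intersect(universe, annotated)
selected_used <- intersect(selected, universe_used)

length(universe_used)
length(selected_used)
```

Any replacement for GOHyperGParams would need to apply the same kind of filtering so that the counts going into the test are comparable.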
### KEGG chunk
paramsKEGG <- new("KEGGHyperGParams",
geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
annotation = "hgu95av2.db",
pvalueCutoff=0.01,
testDirection="over")
hgOverKEGG <- hyperGTest(paramsKEGG)
hgOverKEGG
summary(hgOverKEGG)
My data looks like this:
str(selectedEntrezIds)
chr [1:157] "60528" "6853" "10157" "5081" "389434" "6591" "7414" "7546"
"3074" "6916" "6559" "23503" "8626" "6557" "38" "60" "9733" "113235"
"28962" "10269" "4069" "30835" "7018" ...
> str(entrezUniverse)
chr [1:1310] "8813" "3075" "2729" "8379" "204" "170302" "10165" "6521"
"799" "3052" "1387" "5244" "3674" "6833" "10083" "60528" "8842" "5048"
"4843" "6329" "5080" "6401" "6853" ...
My naive attempts to use DO have included:
paramsDO <- new("DOHyperGParams",
geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
annotation = "DO.db",
pvalueCutoff=0.01,
testDirection="over")
Which of course doesn't work and gives the following error:
Error in getClass(Class, where = topenv(parent.frame())) :
"DOHyperGParams" is not a defined class
> traceback()
3: stop(gettextf("\"%s\" is not a defined class", Class), domain = NA)
2: getClass(Class, where = topenv(parent.frame()))
1: new("DOHyperGParams", geneIds = selectedEntrezIds, universeGeneIds =
entrezUniverse,
annotation = "DO.db", pvalueCutoff = 0.01, testDirection = "over")
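
The error above is what new() raises for any class name that no loaded package has defined. A small sketch of how one might check this before calling new(); the class ToyHyperGParams is hypothetical and is defined here only to show the contrast:

```r
library(methods)

# Hypothetical S4 class, defined only for this illustration.
setClass("ToyHyperGParams", representation(geneIds = "character"))

# TRUE: the class was just defined.
toy_defined <- isClass("ToyHyperGParams")

# FALSE unless some loaded package defines a class with this name.
do_defined <- isClass("DOHyperGParams")

toy_defined
do_defined
```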
Replacing "DOHyperGParams" with "GOHyperGParams" in the call above creates
the object, but running the test then gives the following error:
hgOverDO <- hyperGTest(paramsDO)
Error in match.arg(ontology, c("BP", "CC", "MF")) :
'arg' should be one of “BP”, “CC”, “MF”
traceback()
10: stop(gettextf("'arg' should be one of %s", paste(dQuote(choices),
collapse = ", ")), domain = NA)
9: match.arg(ontology, c("BP", "CC", "MF"))
8: getUniverseViaGo(p)
7: universeBuilder(p)
6: universeBuilder(p)
5: .hyperGTestInternal(p)
4: is(object, Cl)
3: is(object, Cl)
2: .valueClassTest(standardGeneric("hyperGTest"), "HyperGResultBase",
"hyperGTest")
1: hyperGTest(paramsDO)
Any help would be greatly appreciated.
/Ted
> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] KEGG.db_2.4.5 DO.db_2.1.0 GO.db_2.4.5 hgu95av2.db_2.4.5
org.Hs.eg.db_2.4.6 GOstats_2.16.0 RSQLite_0.9-4 DBI_0.2-5 graph_1.28.0
Category_2.16.0 AnnotationDbi_1.12.0
[12] Biobase_2.10.0
loaded via a namespace (and not attached):
[1] annotate_1.28.0 genefilter_1.32.0 GSEABase_1.12.2 RBGL_1.26.0
splines_2.12.1 survival_2.36-2 tools_2.12.1 XML_3.2-0.2 xtable_1.5-6
--
__________________________________________________________________________________
Edward H. Morrow
Department of Ecology and Evolution
Animal Ecology (zooekologi)
Evolutionary Biology Centre
Uppsala University
Norbyvägen 18-D
SE-752 36 Uppsala
SWEDEN
Email: ted.morrow at ebc.uu.se
Tel: +46 18 471 2676
Fax +46 18 471 6484
Webpage: http://www.iee.uu.se/zooekol/default.php?type=personalpage&id=119&lang=en