[BioC] Hypergeometric test with Disease Ontology
Peter Robinson
peter.robinson at charite.de
Thu Jan 27 16:39:14 CET 2011
On 01/27/2011 04:17 PM, Gilbert Feng wrote:
> Hi, Ted
>
> DO.db is a standard sqlite file and you can use standard RSQLite
> procedure to retrieve the information. Actually we do have a lite
> version of Disease Ontology, DOLite that removes some redundant nodes,
> integrated in GeneAnswers package, which also uses hypergeometric test
> to run enrichment analysis as well as automatically generated
> interactive(cytoscape web support) html summary for one or more groups
> of genes.
>
> Best
>
> Gilbert
You might also want to take a look at this website:
http://django.nubic.northwestern.edu/fundo/faq
which implements enrichment analysis using genes annotated to DO terms.
-peter
>
> On 1/27/11 5:00 AM, Ted Morrow wrote:
>> Dear all,
>>
>> I would like to conduct a hypergeometric test on a list of genes but
>> using Disease Ontology instead of GO or KEGG terms. The package "DO.db"
>> contains this information but I have been unable to find a way of using
>> this database in conjunction with the "GOstats" package that I have been
>> using.
>>
>> Has anyone attempted to run a hypergeometric test for Disease Ontology
>> terms? Is there another package I could use? Or is there a way of
>> modifying the argument (if that's the right word) GOHyperGParams in
>> GOstats so that it can make use of the information in DO.db?
>>
>> Both the GO and KEGG analyses work fine:
>> ### GO chunk
>> params<- new("GOHyperGParams",
>> geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
>> annotation = "hgu95av2.db", ontology="BP",
>> pvalueCutoff=0.01, conditional=FALSE,
>> testDirection="over")
>> hgOver<- hyperGTest(params)
>>
>> hgOver
>> Gene to GO BP test for over-representation
>> 2136 GO BP ids tested (15 have p< 0.01)
>> Selected gene set size: 112
>> Gene universe size: 951
>> Annotation package: hgu95av2
>>
>> ### KEGG chunk
>> paramsKEGG<- new("KEGGHyperGParams",
>> geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
>> annotation = "hgu95av2.db",
>> pvalueCutoff=0.01,
>> testDirection="over")
>>
>> hgOverKEGG<- hyperGTest(paramsKEGG)
>> hgOverKEGG
>> summary(hgOverKEGG)
>>
>> My data looks like this:
>> str(selectedEntrezIds)
>> chr [1:157] "60528" "6853" "10157" "5081" "389434" "6591" "7414" "7546"
>> "3074" "6916" "6559" "23503" "8626" "6557" "38" "60" "9733" "113235"
>> "28962" "10269" "4069" "30835" "7018" ...
>>
>> > str(entrezUniverse)
>> chr [1:1310] "8813" "3075" "2729" "8379" "204" "170302" "10165" "6521"
>> "799" "3052" "1387" "5244" "3674" "6833" "10083" "60528" "8842" "5048"
>> "4843" "6329" "5080" "6401" "6853" ...
>>
>> My naive attempts to use DO have included:
>> paramsDO<- new("DOHyperGParams",
>> geneIds = selectedEntrezIds, universeGeneIds = entrezUniverse,
>> annotation = "DO.db",
>> pvalueCutoff=0.01,
>> testDirection="over")
>>
>> Which of course doesn't work and gives the following error:
>> Error in getClass(Class, where = topenv(parent.frame())) :
>> "DOHyperGParams" is not a defined class
>>
>> > traceback()
>> 3: stop(gettextf("\"%s\" is not a defined class", Class), domain = NA)
>> 2: getClass(Class, where = topenv(parent.frame()))
>> 1: new("DOHyperGParams", geneIds = selectedEntrezIds, universeGeneIds =
>> entrezUniverse,
>> annotation = "DO.db", pvalueCutoff = 0.01, testDirection = "over")
>>
>>
>> Replacing "GOHyperGParams" with "DOHyperGParams" also gives the
>> following error:..
>>
>> hgOverDO<- hyperGTest(paramsDO)
>> Error in match.arg(ontology, c("BP", "CC", "MF")) :
>> 'arg' should be one of “BP”, “CC”, “MF”
>>
>> traceback()
>> 10: stop(gettextf("'arg' should be one of %s", paste(dQuote(choices),
>> collapse = ", ")), domain = NA)
>> 9: match.arg(ontology, c("BP", "CC", "MF"))
>> 8: getUniverseViaGo(p)
>> 7: universeBuilder(p)
>> 6: universeBuilder(p)
>> 5: .hyperGTestInternal(p)
>> 4: is(object, Cl)
>> 3: is(object, Cl)
>> 2: .valueClassTest(standardGeneric("hyperGTest"), "HyperGResultBase",
>> "hyperGTest")
>> 1: hyperGTest(paramsDO)
>>
>>
>> Any help would be greatly appreciated.
>> /Ted
>>
>> > sessionInfo()
>> R version 2.12.1 (2010-12-16)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
>> States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>> LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] KEGG.db_2.4.5 DO.db_2.1.0 GO.db_2.4.5 hgu95av2.db_2.4.5
>> org.Hs.eg.db_2.4.6 GOstats_2.16.0 RSQLite_0.9-4 DBI_0.2-5 graph_1.28.0
>> Category_2.16.0 AnnotationDbi_1.12.0
>> [12] Biobase_2.10.0
>>
>> loaded via a namespace (and not attached):
>> [1] annotate_1.28.0 genefilter_1.32.0 GSEABase_1.12.2 RBGL_1.26.0
>> splines_2.12.1 survival_2.36-2 tools_2.12.1 XML_3.2-0.2 xtable_1.5-6
>>
>>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> .
>
--
PD Dr. med. Peter N. Robinson, MSc.
Institut für Medizinische Genetik
Charité - Universitätsmedizin Berlin
Augustenburger Platz 1
13353 Berlin
Germany
voice: 49-30-450566042
fax: 49-30-450569915
email: peter.robinson at charite.de
http://compbio.charite.de/
http://www.human-phenotype-ontology.org
More information about the Bioconductor
mailing list