[BioC] Hypergeometric test with Disease Ontology
Gilbert Feng
g-feng at northwestern.edu
Thu Jan 27 16:46:29 CET 2011
Thanks, Peter. Yes, FunDO is also a good choice. It also uses a
hypergeometric test, and in addition it can show the GeneRIF evidence
for the association between diseases and genes.
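
For anyone who wants to see what the over-representation test itself
computes, here is a minimal sketch in base R using phyper(). It is only
an illustration under stated assumptions: the gene IDs, the term labels
"DOID:A" and "DOID:B", and the helper name do_enrich are all made up,
and this is not the code used by GeneAnswers, FunDO, or GOstats.

```r
# Toy data (hypothetical): gene IDs and term-to-gene annotations are invented.
universe <- as.character(1:20)
selected <- c("1", "2", "3", "4", "5")
term2genes <- list(
  "DOID:A" = c("1", "2", "3", "6"),
  "DOID:B" = c("10", "11", "12", "13")
)

# Hypothetical helper: one-sided hypergeometric test per term.
do_enrich <- function(selected, universe, term2genes) {
  universe <- unique(universe)
  selected <- intersect(unique(selected), universe)
  res <- lapply(names(term2genes), function(term) {
    annotated <- intersect(term2genes[[term]], universe)
    k <- length(intersect(annotated, selected))  # selected genes on the term
    m <- length(annotated)                       # universe genes on the term
    n <- length(universe) - m                    # universe genes off the term
    # P(X >= k) when drawing length(selected) genes without replacement
    p <- phyper(k - 1, m, n, length(selected), lower.tail = FALSE)
    data.frame(term = term, count = k, size = m, pvalue = p,
               stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, res)
  out[order(out$pvalue), ]
}

result <- do_enrich(selected, universe, term2genes)
print(result)
```

With real data, term2genes would be replaced by a gene-to-DO-term
mapping restricted to the gene universe, and the p-values would normally
be adjusted for multiple testing (for example with p.adjust()).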
Gilbert
On 1/27/11 9:39 AM, Peter Robinson wrote:
> On 01/27/2011 04:17 PM, Gilbert Feng wrote:
>> Hi, Ted
>>
>> DO.db is a standard SQLite file, and you can use the standard RSQLite
>> procedure to retrieve the information. We also have a lite version of
>> the Disease Ontology, DOLite, which removes some redundant nodes. It is
>> integrated in the GeneAnswers package, which also uses a hypergeometric
>> test to run enrichment analysis and automatically generates an
>> interactive (Cytoscape Web support) HTML summary for one or more groups
>> of genes.
>>
>> Best
>>
>> Gilbert
>
> You might also want to take a look at this website:
> http://django.nubic.northwestern.edu/fundo/faq
> which implements enrichment analysis using genes annotated to DO terms.
> -peter
>
>>
>> On 1/27/11 5:00 AM, Ted Morrow wrote:
>>> Dear all,
>>>
>>> I would like to conduct a hypergeometric test on a list of genes but
>>> using Disease Ontology instead of GO or KEGG terms. The package "DO.db"
>>> contains this information but I have been unable to find a way of using
>>> this database in conjunction with the "GOstats" package that I have been
>>> using.
>>>
>>> Has anyone attempted to run a hypergeometric test for Disease Ontology
>>> terms? Is there another package I could use? Or is there a way of
>>> modifying the argument (if that's the right word) GOHyperGParams in
>>> GOstats so that it can make use of the information in DO.db?
>>>
>>> Both the GO and KEGG analyses work fine:
>>> ### GO chunk
>>> params <- new("GOHyperGParams",
>>>               geneIds = selectedEntrezIds,
>>>               universeGeneIds = entrezUniverse,
>>>               annotation = "hgu95av2.db", ontology = "BP",
>>>               pvalueCutoff = 0.01, conditional = FALSE,
>>>               testDirection = "over")
>>> hgOver <- hyperGTest(params)
>>>
>>> hgOver
>>> Gene to GO BP test for over-representation
>>> 2136 GO BP ids tested (15 have p< 0.01)
>>> Selected gene set size: 112
>>> Gene universe size: 951
>>> Annotation package: hgu95av2
>>>
>>> ### KEGG chunk
>>> paramsKEGG <- new("KEGGHyperGParams",
>>>                   geneIds = selectedEntrezIds,
>>>                   universeGeneIds = entrezUniverse,
>>>                   annotation = "hgu95av2.db",
>>>                   pvalueCutoff = 0.01,
>>>                   testDirection = "over")
>>>
>>> hgOverKEGG <- hyperGTest(paramsKEGG)
>>> hgOverKEGG
>>> summary(hgOverKEGG)
>>>
>>> My data looks like this:
>>> str(selectedEntrezIds)
>>> chr [1:157] "60528" "6853" "10157" "5081" "389434" "6591" "7414" "7546"
>>> "3074" "6916" "6559" "23503" "8626" "6557" "38" "60" "9733" "113235"
>>> "28962" "10269" "4069" "30835" "7018" ...
>>>
>>> > str(entrezUniverse)
>>> chr [1:1310] "8813" "3075" "2729" "8379" "204" "170302" "10165" "6521"
>>> "799" "3052" "1387" "5244" "3674" "6833" "10083" "60528" "8842" "5048"
>>> "4843" "6329" "5080" "6401" "6853" ...
>>>
>>> My naive attempts to use DO have included:
>>> paramsDO <- new("DOHyperGParams",
>>>                 geneIds = selectedEntrezIds,
>>>                 universeGeneIds = entrezUniverse,
>>>                 annotation = "DO.db",
>>>                 pvalueCutoff = 0.01,
>>>                 testDirection = "over")
>>>
>>> Which of course doesn't work and gives the following error:
>>> Error in getClass(Class, where = topenv(parent.frame())) :
>>> "DOHyperGParams" is not a defined class
>>>
>>> > traceback()
>>> 3: stop(gettextf("\"%s\" is not a defined class", Class), domain = NA)
>>> 2: getClass(Class, where = topenv(parent.frame()))
>>> 1: new("DOHyperGParams", geneIds = selectedEntrezIds, universeGeneIds =
>>> entrezUniverse,
>>> annotation = "DO.db", pvalueCutoff = 0.01, testDirection = "over")
>>>
>>>
>>> Using "GOHyperGParams" in place of "DOHyperGParams" also gives the
>>> following error:
>>>
>>> hgOverDO <- hyperGTest(paramsDO)
>>> Error in match.arg(ontology, c("BP", "CC", "MF")) :
>>> 'arg' should be one of “BP”, “CC”, “MF”
>>>
>>> traceback()
>>> 10: stop(gettextf("'arg' should be one of %s", paste(dQuote(choices),
>>> collapse = ", ")), domain = NA)
>>> 9: match.arg(ontology, c("BP", "CC", "MF"))
>>> 8: getUniverseViaGo(p)
>>> 7: universeBuilder(p)
>>> 6: universeBuilder(p)
>>> 5: .hyperGTestInternal(p)
>>> 4: is(object, Cl)
>>> 3: is(object, Cl)
>>> 2: .valueClassTest(standardGeneric("hyperGTest"), "HyperGResultBase",
>>> "hyperGTest")
>>> 1: hyperGTest(paramsDO)
>>>
>>>
>>> Any help would be greatly appreciated.
>>> /Ted
>>>
>>> > sessionInfo()
>>> R version 2.12.1 (2010-12-16)
>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>
>>> locale:
>>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
>>> States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>>> LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] KEGG.db_2.4.5 DO.db_2.1.0 GO.db_2.4.5 hgu95av2.db_2.4.5
>>> org.Hs.eg.db_2.4.6 GOstats_2.16.0 RSQLite_0.9-4 DBI_0.2-5 graph_1.28.0
>>> Category_2.16.0 AnnotationDbi_1.12.0
>>> [12] Biobase_2.10.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] annotate_1.28.0 genefilter_1.32.0 GSEABase_1.12.2 RBGL_1.26.0
>>> splines_2.12.1 survival_2.36-2 tools_2.12.1 XML_3.2-0.2 xtable_1.5-6
>>>
>>>
>>
>
>
--
-----------------------------------------------
Gang (Gilbert) Feng, PhD
Biomedical Informatics Center
Robert H. Lurie Comprehensive Cancer Center
Northwestern University
750 N. Lake Shore Drive, 11th Floor(11-175e)
Chicago, IL 60611
Phone:312-503-2358
Email g-feng (at) northwestern.edu