[BioC] GOstat: listing genes from hyperGTest
James W. MacDonald
jmacdon at med.umich.edu
Wed Oct 22 15:10:39 CEST 2008
Hi Tim,
Does probeSetSummary() do what you want?
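For reference, here is a minimal sketch of one way to pull the selected genes for each significant term. It assumes an installed Category version that exports sigCategories() and geneIdsByCategory(); the helper name genesBySigTerm and the 0.01 cutoff are hypothetical. hyperGTest counts genes annotated to a term or to any of its descendants, whereas org.Hs.egGO2EG holds only direct annotations. That would explain why a term such as GO:0043284 has no entry there. org.Hs.egGO2ALLEGS includes the inherited annotations.

```r
library(GOstats)       # hyperGTest(); also attaches Category
library(org.Hs.eg.db)  # org.Hs.egGO2ALLEGS

# Hypothetical helper: 'result' is the object returned by hyperGTest(),
# 'selected' is the vector of Entrez IDs of interest (e.g. genes1).
genesBySigTerm <- function(result, selected, p = 0.01) {
  # GO IDs whose p-value falls below the (illustrative) cutoff
  sigTerms <- sigCategories(result, p)
  if (length(sigTerms) == 0) return(list())

  # Option 1: let the result object report the selected genes per term
  fromResult <- geneIdsByCategory(result, catids = sigTerms)

  # Option 2: look the terms up in the map that includes annotations
  # inherited from child terms, then keep only the selected genes
  allEgs <- mget(sigTerms, org.Hs.egGO2ALLEGS, ifnotfound = NA)
  fromMap <- lapply(allEgs, function(egs) intersect(unique(egs), selected))

  list(fromResult = fromResult, fromMap = fromMap)
}
```

With the objects from the quoted code below, this could be called as res <- genesBySigTerm(GO, genes1). The two lists may differ slightly, because the result object also restricts genes to the test universe.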
Best,
Jim
Tim Smith wrote:
>
> Hi,
>
> I was performing a hyperGTest for Homo sapiens genes. For a set of
> input genes, this function returns some 'significant' GO terms. What I
> now want to do is associate each significant GO term (returned by
> this function) with the genes (from my set of input genes) annotated to
> that GO term. However, I think that I may be using the wrong
> package/function to get the relevant set of genes.
>
> Currently, what I'm doing is finding the significant GO terms by using the following code:
>
> -----------------------
> ### 'genes1' are the Entrez IDs of my genes of interest, and 'allGenes' is the universe of Entrez IDs
>
> paramsGO <- new("GOHyperGParams", geneIds = genes1,
> universeGeneIds = allGenes, annotation = "org.Hs.eg.db",
> ontology = "BP", pvalueCutoff = 1, conditional = FALSE,
> testDirection = "over")
>
> GO <- hyperGTest(paramsGO)
> --------------------------
> This gives me a set of significant GO terms. Now, I would like to find which
> subset of genes in 'genes1' is associated with each of the significant
> GO terms. To do this, I map all GO terms to their Entrez IDs with the
> 'org.Hs.eg.db' package, using the following:
>
> xx <- as.list(org.Hs.egGO2EG)
>
> to get a mapping of GO terms to Entrez IDs. I get 6,756 GO terms (isn't
> this number small?) that map to at least one Entrez ID. From here, I
> look up which Entrez IDs are associated with my GO term of interest.
>
> My problem is that the GO term from hyperGTest is often not associated
> with any Entrez ID (using xx <- as.list(org.Hs.egGO2EG), described
> above), i.e. the GO term/ID is not in the list obtained from
> 'org.Hs.egGO2EG'. For example, the term 'GO:0043284' is returned by
> hyperGTest, but does not appear to be associated with any Entrez IDs in
> the org.Hs.eg.db package. Where could I be going wrong?
>
> I would give a set of genes so that the example is reproducible, but with hundreds of genes the email will get too long!
>
> Thanks for any comments/suggestions. I realize that I'm probably doing something really stupid here....
>
> My sessionInfo() is:
> --------------------------------
> R version 2.7.2 (2008-08-25)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] grid splines tools stats graphics grDevices utils datasets methods base
>
> other attached packages:
>  [1] gplots_2.6.0         gmodels_2.14.1       gtools_2.4.0         gdata_2.4.1          Rgraphviz_1.18.1     GOstats_2.6.0        Category_2.6.0
>  [8] RBGL_1.16.0          annotate_1.18.0      xtable_1.5-2         graph_1.18.0         PFAM.db_2.2.0        GO.db_2.2.0          KEGG.db_2.2.0
> [15] org.Hs.eg.db_2.2.0   AnnotationDbi_1.2.0  RSQLite_0.6-8        DBI_0.2-4            genefilter_1.20.0    survival_2.34-1      affy_1.18.0
> [22] preprocessCore_1.2.0 affyio_1.8.0         Biobase_2.0.0
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.11 MASS_7.2-44
>
>
> ---------------------------------
--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662