[BioC] AnnotationDbi Packages/hyperGTest: Possible to avoid explicit mapping back to ENTREZ IDs?
Robert Gentleman
rgentlem at fhcrc.org
Thu May 22 15:38:36 CEST 2008
Hi Johannes,
Johannes Graumann wrote:
> Hi,
>
> I built an annotation package for a mouse IPI database using "AnnotationDbi"
> (IPI.MOUSE.3.37.20080509.db) and am now putting it to use.
> I have a bunch of IPI-ID clusters ("SubjectiveClusters"), each of which I
> want to check for GO terms significantly enriched over the combined set. I
> attached what I do, but was wondering whether there are ways to do this
> without having to explicitly/manually go via ENTREZ IDs. The annotation
> package already contains all the necessary information, no? Is there anything
> more recent than "GOstats" to look at this?
Did you have a look at the vignette? It does seem to explain how to
do this, and there are examples, such as the YEAST package, that do not
use EntrezGene IDs. What is essential is that geneIds and
universeGeneIds contain the same type of identifier, and that this is the
type you made the primary key when you built your annotation package, so
that given one of these IDs you can find all relevant GO terms (though
that part is an AnnotationDbi question).
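
The matching requirement can be checked before any parameter object is built. Below is a minimal sketch in base R, not taken from GOstats or AnnotationDbi: the helper name check_gene_ids and the IPI-style identifiers are made up for illustration, and `mapped` merely stands in for whatever mappedkeys() would return for the package's GO map.

```r
# Hypothetical helper (not part of GOstats or AnnotationDbi): make sure the
# cluster IDs and the universe IDs are drawn from the same key set that the
# annotation package can map to GO.
check_gene_ids <- function(geneIds, universeGeneIds, mapped) {
  # Keep only universe IDs that the annotation package knows about
  universe <- intersect(unique(universeGeneIds), mapped)
  # The selected set must be a subset of the (filtered) universe
  genes <- intersect(unique(geneIds), universe)
  list(
    geneIds         = genes,
    universeGeneIds = universe,
    droppedGenes    = setdiff(unique(geneIds), genes)
  )
}

# Toy data; these identifiers are invented for the example.
mapped   <- c("IPI00000001", "IPI00000002", "IPI00000003")
clusters <- list(a = c("IPI00000001", "IPI00000004"),
                 b = c("IPI00000002", "IPI00000003"))
universe <- unlist(clusters, use.names = FALSE)

checked <- check_gene_ids(clusters$a, universe, mapped)
```

In this toy case the unmapped ID in cluster "a" ends up in checked$droppedGenes rather than in the gene set or the universe, so both vectors stay in the same identifier space.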
>
> Thanks, Joh
>
> library(IPI.MOUSE.3.37.20080509.db)
> library("GOstats")
> # Extract all IPI to ENTREZID pairs
> ENTREZIDs <- IPI.MOUSE.3.37.20080509ENTREZID
> # Combine all present IPI IDs to the Universe
> Universe <- unlist(SubjectiveClusters, use.names = FALSE)
> # Purge IPIs not mapping to ENTREZ
> Universe <- Universe[Universe %in% mappedkeys(ENTREZIDs)]
> # Map universe to ENTREZ IDs
> Universe <- unique(unlist(as.list(ENTREZIDs[Universe]), use.names = FALSE))
> SubjectiveClustersHGT <- vector("list", 0)
> for (name in names(SubjectiveClusters)) {
>   Temp.Cluster <- SubjectiveClusters[[name]]
>   Temp.Cluster <- Temp.Cluster[Temp.Cluster %in% mappedkeys(ENTREZIDs)]
>   Temp.Cluster <- unlist(as.list(ENTREZIDs[Temp.Cluster]), use.names = FALSE)
>   params <- new(
>     "GOHyperGParams",
>     geneIds = Temp.Cluster,
>     universeGeneIds = Universe,
>     annotation = "IPI.MOUSE.3.37.20080509.db",
>     ontology = "BP",
>     pvalueCutoff = 0.01,
>     conditional = TRUE,
>     testDirection = "over"
>   )
>   cat("HyperGTest for cluster ", name, ".\n", sep = "")
>   SubjectiveClustersHGT[[name]] <- hyperGTest(params)
> }
>
--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org