[BioC] AnnotationDbi Packages/hyperGTest: Possible to avoid explicit mapping back to ENTREZ IDs?

Robert Gentleman rgentlem at fhcrc.org
Thu May 22 15:38:36 CEST 2008


Hi Johannes,

Johannes Graumann wrote:
> Hi,
> 
> I built an annotation package for a mouse IPI database using "AnnotationDbi"
> (IPI.MOUSE.3.37.20080509.db) and am now putting it to use.
> I have a bunch of IPI-ID clusters ("SubjectiveClusters") each of which I
> want to check for GO terms significantly enriched over the combined set. I
> attached what I do, but was wondering whether there are ways to do this
> without having to explicitly/manually go via ENTREZ IDs. The annotation
> package contains all information necessary already, no? Is there anything
> more recent than "GOstats" to look at this?

   Did you have a look at the vignette? It seems to explain how to do
this, and there are examples (such as the YEAST package) that do not
use EntrezGene IDs. What is essential is that geneIds and
universeGeneIds use the same type of identifier, and that this is the
identifier you made the primary key when you built your annotation
package, so that all relevant GO terms can be found from any one of
them (how that lookup works is an AnnotationDbi question).
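
A minimal sketch of that idea, assuming the IPI IDs are the primary key of
the annotation package and that hyperGTest accepts them directly in that
case (not verified against this particular package). The helper names
prepare_gene_sets and run_go_tests are hypothetical, as are the toy IDs;
only the GOHyperGParams arguments are taken from the quoted code.

```r
# Assumption: `clusters` is a named list of character vectors of IPI IDs
# (like SubjectiveClusters in the quoted code) and `keys` holds the
# primary keys of the annotation package that carry GO annotation.

# Build the universe and the per-cluster gene sets in the primary-key
# ID space, without any detour via Entrez Gene IDs.
prepare_gene_sets <- function(clusters, keys) {
  universe <- unique(unlist(clusters, use.names = FALSE))
  universe <- universe[universe %in% keys]
  gene_sets <- lapply(clusters, function(ids) unique(ids[ids %in% universe]))
  list(universe = universe, gene_sets = gene_sets)
}

# Run one hypergeometric test per cluster. Calling this requires GOstats
# and the annotation package to be loaded; defining it does not.
run_go_tests <- function(prepared, annotation) {
  lapply(prepared$gene_sets, function(ids) {
    params <- new("GOHyperGParams",
                  geneIds = ids,
                  universeGeneIds = prepared$universe,
                  annotation = annotation,
                  ontology = "BP",
                  pvalueCutoff = 0.01,
                  conditional = TRUE,
                  testDirection = "over")
    hyperGTest(params)
  })
}

# Toy illustration of the preparation step (made-up IDs).
toy_clusters <- list(a = c("IPI001", "IPI002", "IPI002"),
                     b = c("IPI003", "IPI999"))
toy_keys <- c("IPI001", "IPI002", "IPI003")
prepared <- prepare_gene_sets(toy_clusters, toy_keys)
```

With the real data, `keys` could come from something like
mappedkeys(IPI.MOUSE.3.37.20080509GO), where the name of the GO map is an
assumption based on the usual AnnotationDbi naming, and the tests would
then be run as
run_go_tests(prepare_gene_sets(SubjectiveClusters, keys),
"IPI.MOUSE.3.37.20080509.db").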

> 
> Thanks, Joh
> 
> library(IPI.MOUSE.3.37.20080509.db)
> library("GOstats")
> # Extract all IPI to ENTREZID pairs
> ENTREZIDs <- IPI.MOUSE.3.37.20080509ENTREZID
> # Combine all present IPI IDs to the Universe
> Universe <- unlist(SubjectiveClusters,use.names=FALSE)
> # Purge IPIs not mapping to ENTREZ
> Universe <- Universe[Universe %in% mappedkeys(ENTREZIDs)]
> # Map universe to ENTREZ IDs
> Universe <- unique(unlist(as.list(ENTREZIDs[Universe]),use.names=FALSE))
> SubjectiveClustersHGT <- vector("list",0)
> for(name in names(SubjectiveClusters)){
>   Temp.Cluster <- SubjectiveClusters[[name]]
>   Temp.Cluster <- Temp.Cluster[Temp.Cluster %in% mappedkeys(ENTREZIDs)]
>   Temp.Cluster <- unlist(as.list(ENTREZIDs[Temp.Cluster]),use.names=FALSE)
>   params <- new(
>     "GOHyperGParams", 
>     geneIds = Temp.Cluster,
>     universeGeneIds = Universe,
>     annotation = "IPI.MOUSE.3.37.20080509.db",
>     ontology = "BP",
>     pvalueCutoff = 0.01,
>     conditional = TRUE,
>     testDirection = "over"
>   )
>   cat("HyperGTest for cluster ",name,".\n",sep="")
>   SubjectiveClustersHGT[[name]] <- hyperGTest(params)
> }
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


