[R] GO & Protein Complex Analysis for Homo sapiens

Martin Morgan mtmorgan at fhcrc.org
Tue Sep 13 15:14:40 CEST 2011

On 09/13/2011 05:21 AM, Sandeep Amberkar wrote:
> Dear All,
> I need to fetch GO ontologies for Homo sapiens with their mappings to
> corresponding UniProt identifiers. I would be using this information to
> compare results from a clustering algorithm with existing protein complexes.
> This would be a test to check how accurately the clustering algorithm
> captures GO terms with respect to the known protein complexes. Can anyone
> suggest a simple workflow with the requisite packages? I am trying to find
> out how to fetch GO ontologies for Homo sapiens with Bioconductor, but most
> packages are designed for enrichment analysis. Am I missing something here?
> Any help would be greatly appreciated.


This question is a better fit for the Bioconductor mailing list.


For the annotation part of your question, the GO.db package represents the
GO ontologies, and org.Hs.eg.db contains the UniProt mappings. Both
provide 'bimaps' that map from a central identifier (GO id for GO.db;
Entrez gene id for the org.*.eg.db packages). For instance:

 > library(GO.db); library(org.Hs.eg.db) # load the annotation packages
 > GOTERM[["GO:0000022"]] # [[ to extract single entries
GOID: GO:0000022
Term: mitotic spindle elongation
Ontology: BP
Definition: Lengthening of the distance between poles of the mitotic
Synonym: spindle elongation during mitosis
 > egid <- revmap(org.Hs.egGO)[["GO:0000022"]] # reverse map, extract
 > toTable(org.Hs.egUNIPROT[egid]) # subset map; convert to data.frame
   gene_id uniprot_id
1    9055     O43663
2    9493     Q02241
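
If the aim is the full mapping rather than a single term, the same bimaps can be flattened and joined on the Entrez gene id. This is only a minimal sketch, assuming the org.Hs.egGO and org.Hs.egUNIPROT bimaps used above; the object names go2uniprot and uniprot_by_go are made up for illustration.

```r
library(org.Hs.eg.db)

# Entrez gene id -> GO id (direct annotations only), as a data.frame
go <- toTable(org.Hs.egGO)
# Drop the evidence code so each gene/term pair appears once
go <- unique(go[, c("gene_id", "go_id", "Ontology")])

# Entrez gene id -> UniProt accession, as a data.frame
up <- toTable(org.Hs.egUNIPROT)

# Join the two tables on the shared Entrez gene id
go2uniprot <- merge(go, up, by = "gene_id")

# One character vector of UniProt accessions per GO term
# (uniprot_by_go is a hypothetical name, not part of any package)
uniprot_by_go <- lapply(split(go2uniprot$uniprot_id, go2uniprot$go_id), unique)
```

Note that org.Hs.egGO holds direct annotations only; org.Hs.egGO2ALLEGS also includes genes annotated to child terms, which may or may not be what a comparison against complexes needs.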

The annotation packages are documented in vignettes; see, e.g.,
browseVignettes("AnnotationDbi").

To me your analysis sounds like some kind of hypergeometric test. The 
GOstats package is designed to do these, in the context of the GO 
directed acyclic graph.
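
To make the idea concrete, here is a rough sketch of a single-term hypergeometric test in base R. All four counts are invented for illustration and do not come from any real data set, and this plain test ignores the graph structure that GOstats takes into account.

```r
# Hypothetical counts, for illustration only
N <- 1000  # proteins in the universe
K <- 50    # proteins in the universe annotated to the GO term
n <- 20    # proteins in one cluster
k <- 5     # proteins in the cluster annotated to the GO term

# P(X >= k): probability of seeing at least k annotated proteins
# in a cluster of size n drawn without replacement from the universe.
# phyper(q, ...) with lower.tail = FALSE gives P(X > q), hence k - 1.
p <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# Same quantity computed by summing the point probabilities
p_check <- sum(dhyper(k:min(K, n), K, N - K, n))
```

A small p suggests the cluster contains more proteins annotated to the term than expected by chance; testing many terms this way would need a multiple-testing correction such as p.adjust().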


> Thanks a lot in advance.
> --
> Warm Regards,
> Sandeep Amberkar
> BioQuant,BQ26,
> Im Neuenheimer Feld 267,
> D-69120,Heidelberg

Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793
