[BioC] Why not protein ids for GO resp. GOstats?

Robert Gentleman rgentlem at fhcrc.org
Mon Feb 12 02:11:15 CET 2007



Kai Schlamp wrote:
> Hello, everyone,
> 
> sorry if my question is a bit silly, I am new to bioinformatics and 
> Bioconductor.
  Welcome. As with all list-servs, there is a substantial body of past 
discussion and some etiquette.  Please read the posting guide; from it 
you will find that there are searchable mail archives, which are a good 
place to start. For example, searching on GO and EntrezID will give you 
a number of threads to look at.  In general, when you have a question, 
spending a few minutes searching the discussion list will often turn up 
the answer.

> Why do GOstats and the GO annotation package use Entrez Gene IDs?
> I thought that GO terms are associated with proteins. And because one 
> gene can be translated into different proteins, analysis with gene 
> IDs alone may be problematic.

   Do you have a reference for that? AFAIK that is not true and all 
mapping is via Entrez (or equivalent), but if you have some 
documentation that proves otherwise, we would all be happy to see it.


> Especially if the data consists of miRNAs or proteins and you have to 
> convert it back to gene IDs.
> So why does Bioconductor not use the annotation data from the Gene 
> Ontology website? For example, in the human annotation file, there are 
> UniProt and IPI identifiers annotated.

   Because that is not how it works. AFAIK we are not yet at the point 
where anyone can reliably distinguish different protein products in a 
high-throughput assay (which is what the annotation packages are 
designed for). In particular, if one is mapping from transcripts (e.g. 
microarrays), it is unlikely that there is sufficient resolution to know 
which potential mRNA is there. Even if there were, that would only cover 
those proteins with SNPs (or similar) in the region probed and would not 
address any post-translational modifications etc.
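
   To make the gene-level approach concrete, here is a minimal sketch in 
Python. It is not the Bioconductor implementation; the two lookup tables, 
the function name go_terms_by_gene, and every identifier in it are made 
up for illustration. It only shows the general idea: protein-level 
identifiers are first collapsed to gene identifiers, and GO terms are 
then looked up once per gene.

```python
# Toy illustration only: every identifier below is hypothetical.

# Assumed protein -> gene mapping; two isoforms share one gene.
PROTEIN_TO_GENE = {
    "protA_iso1": "gene1",
    "protA_iso2": "gene1",
    "protB": "gene2",
}

# Assumed gene-level GO annotation (placeholder term names).
GENE_TO_GO = {
    "gene1": {"GO:term_x", "GO:term_y"},
    "gene2": {"GO:term_y"},
}


def go_terms_by_gene(protein_ids):
    """Collapse protein IDs to gene IDs, then look up GO terms per gene."""
    result = {}
    for pid in protein_ids:
        gene = PROTEIN_TO_GENE.get(pid)
        if gene is None:
            continue  # identifiers without a gene mapping are skipped
        # Isoforms of the same gene land on the same key, so each gene
        # appears only once in the result.
        result[gene] = GENE_TO_GO.get(gene, set())
    return result


annotated = go_terms_by_gene(["protA_iso1", "protA_iso2", "protB", "unknown"])
```

   Note that both isoforms of the hypothetical "gene1" end up under a 
single entry, so any isoform-specific information is lost at this step.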


   best wishes
    Robert
> 
> Best regards,
> Kai
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


