[BioC] hgu133plus2 GO issues

Jake jjmichael at comcast.net
Tue Apr 18 18:49:43 CEST 2006


On Tue, 2006-04-18 at 09:37 -0700, Seth Falcon wrote:
> Hi Jake,
> 
> Jake <jjmichael at comcast.net> writes:
> > Could someone please help me understand the differences between the
> > (hgu133plus2)GO, GO2PROBE, GO2ALLPROBES?  I've found discrepancies that I
> > can't quite explain:
> >
> >  > mget("GO:0042611", hgu133plus2GO2PROBE)
> > Error: value for 'GO:0042611' not found
> 
> GO annotates probe ids (really Entrez Gene ids) at the most specific
> term in the GO ontology.  In the above search of hgu133plus2GO2PROBE,
> you are seeing that GO:0042611 does not have any annotations.
> 
> 
> >  > mget("GO:0042611", hgu133plus2GO2ALLPROBES)
> > $"GO:0042611"
> >           <NA>            IEA            IEA            IEA
> > <NA>
> >    "209309_at"  "217014_s_at"    "210325_at"  "218831_s_at"
> [snip]
> 
> For a given GO term, the hgu133plus2GO2ALLPROBES environment is giving
> you all Affy ids that map to this GO term _or_ a more specific term
> that is related to this term (by related, I mean child-like relation,
> where there is a path in the DAG connecting the terms).
> 
> The names on the vector are evidence codes.  See the man pages for
> details.
> 
> So for the above two cases, this is as expected and I don't think
> there is any inconsistency.  
> 
> > and finally...
> >
> > ### "208729_x_at" is one of the probes returned with the above command
> >  > grep("GO:0042611", unlist(mget("208729_x_at", hgu133plus2GO)))
> > numeric(0)
> 
> When you say "above command", which one are you referring to?
> hgu133plus2GO should be the inverse map for hgu133plus2GO2PROBE.  
> 
> > "208729_x_at" is on the hgu133plus2 chip, but GO and GO2ALLPROBES don't
> > map it to the same GO ID.
> 
> Can you be more specific?  Which env in the GO package are you talking
> about?  Note that GO2ALLPROBES does not map to GO ids, it maps _from_
> GO ids.
> 
> You can ask which GO ids have the 208729_x_at annotation using
> hgu133plus2GO.
> 
> If you then grep through hgu133plus2GO2ALLPROBES for GO ids that have
> 208729_x_at in their probe vector, then you should find more GO ids
> because you are picking up parent terms that don't have the specific
> annotation.  However, all the ids you found in hgu133plus2GO should
> appear.
> 
> Clear as mud? :-)
> 
> > Is there something wrong here or am I just missing something?  If
> > different, which is the most "reliable" mapping?  I'm concerned because
> > I went through to validate GO IDs I had gotten from the GOHyperG
> > function (a total of 314), and 117 of those I could not map back to my
> > significant probe list using the hgu133plus2GO annotation.  I noticed by
> > looking at the GOHyperG function that it uses information from
> > GO2ALLPROBES.
> >
> > Any help/enlightenment is much appreciated.
> >
> > PS - using R 2.2.1 with hgu133plus2 1.10.0
> 
> PS: sessionInfo() would be a better way to report that.  Then we would
> also know your version of the GO package, for example.
> 
> + seth

Thanks for all the help, guys. It really improved my understanding of how
the GO mappings work in the context of BioC.  I had previously assumed
that mappings in all the GO environments were multi-level, and now I
know that only the GO2ALLPROBES environment is.
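
For anyone who finds this thread later, here is a rough sketch of the
distinction in base R, using toy data rather than the real hgu133plus2
environments. Every term id and probe id below is made up, and the names
probe2go, go2probe, and go2allprobes are only stand-ins for
hgu133plus2GO, hgu133plus2GO2PROBE, and hgu133plus2GO2ALLPROBES. It
assumes, as Seth described, that direct annotations sit at the most
specific term and that GO2ALLPROBES also collects probes from more
specific terms.

```r
# Toy data only: all term and probe ids are hypothetical.
# Parent -> children edges of a tiny GO-like DAG.
children <- list(
  "GO:top"  = c("GO:mid"),
  "GO:mid"  = c("GO:leaf"),
  "GO:leaf" = character(0)
)

# Direct annotations (stand-in for hgu133plus2GO):
# each probe lists only its most specific terms.
probe2go <- list(
  "p1_at" = c("GO:leaf"),
  "p2_at" = c("GO:mid")
)

# Inverse of the direct map (stand-in for hgu133plus2GO2PROBE).
direct   <- stack(probe2go)  # columns: values (term), ind (probe)
go2probe <- split(as.character(direct$ind), direct$values)

# All terms reachable below a given term in the DAG.
descendants <- function(term) {
  kids <- children[[term]]
  unique(c(kids, unlist(lapply(kids, descendants))))
}

# Stand-in for hgu133plus2GO2ALLPROBES: probes annotated at the term
# itself or at any more specific term.
go2allprobes <- lapply(
  setNames(names(children), names(children)),
  function(term) {
    terms <- c(term, descendants(term))
    unique(unlist(go2probe[terms], use.names = FALSE))
  }
)

is.null(go2probe[["GO:top"]])        # no direct annotation at this term
go2allprobes[["GO:top"]]             # but it inherits probes from below
"GO:top" %in% probe2go[["p1_at"]]    # the probe's direct list omits it
```

This is the same pattern I ran into: a term can be missing from
GO2PROBE, still return probes from GO2ALLPROBES, and not appear in the
GO entry of any of those probes. So GO ids that come out of GOHyperG
would need to be checked against GO2ALLPROBES rather than against
hgu133plus2GO.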

Jim, sorry for replying to you personally. I meant to send it to the list,
but I frequently hit "reply" instead of "reply to all" by accident.

--Jake


