[BioC] hgu133plus2 GO issues
Jake
jjmichael at comcast.net
Tue Apr 18 18:49:43 CEST 2006
On Tue, 2006-04-18 at 09:37 -0700, Seth Falcon wrote:
> Hi Jake,
>
> Jake <jjmichael at comcast.net> writes:
> > Could someone please help me understand the differences between the
> > (hgu133plus2)GO, GO2PROBE, GO2ALLPROBES? I've found discrepancies that I
> > can't quite explain:
> >
> > > mget("GO:0042611", hgu133plus2GO2PROBE)
> > Error: value for 'GO:0042611' not found
>
> GO annotates probe ids (really Entrez Gene ids) at the most specific
> term in the GO ontology. In the above search of hgu133plus2GO2PROBE,
> you are seeing that GO:0042611 does not have any annotations.
>
>
> > > mget("GO:0042611", hgu133plus2GO2ALLPROBES)
> > $"GO:0042611"
> > <NA> IEA IEA IEA
> > <NA>
> > "209309_at" "217014_s_at" "210325_at" "218831_s_at"
> [snip]
>
> For a given GO term, the hgu133plus2GO2ALLPROBES environment is giving
> you all Affy ids that map to this GO term _or_ a more specific term
> that is related to this term (by related, I mean child-like relation,
> where there is a path in the DAG connecting the terms).
>
> The names on the vector are evidence codes. See the man pages for
> details.
>
> So for the above two cases, this is as expected and I don't think
> there is any inconsistency.
>
> > and finally...
> >
> > ### "208729_x_at" is one of the probes returned with the above command
> > > grep("GO:0042611", unlist(mget("208729_x_at", hgu133plus2GO)))
> > numeric(0)
>
> When you say "above command", which one are you referring to?
> hgu133plus2GO should be the inverse map for hgu133plus2GO2PROBE.
>
> > "208729_x_at" is on the hgu133plus2 chip, but GO and GO2ALLPROBES don't
> > map it to the same GO ID.
>
> Can you be more specific? Which env in the GO package are you talking
> about? Note that GO2ALLPROBES does not map to GO ids, it maps _from_
> GO ids.
>
> You can ask which GO ids have the 208729_x_at annotation using
> hgu133plus2GO.
>
> If you then grep through hgu133plus2GO2ALLPROBES for GO ids that have
> 208729_x_at in their probe vector, then you should find more GO ids
> because you are picking up parent terms that don't have the specific
> annotation. However, all the ids you found in hgu133plus2GO should
> appear.
>
> Clear as mud? :-)
>
> > Is there something wrong here or am I just missing something? If
> > different, which is the most "reliable" mapping? I'm concerned because
> > I went through to validate GO IDs I had gotten from the GOHyperG
> > function (a total of 314), and 117 of those I could not map back to my
> > significant probe list using the hgu133plus2GO annotation. I noticed by
> > looking at the GOHyperG function that it uses information from
> > GO2ALLPROBES.
> >
> > Any help/enlightenment is much appreciated.
> >
> > PS - using R 2.2.1 with hgu133plus2 1.10.0
>
> PS: sessionInfo() would be a better way to report that. Then we would
> also know your version of the GO package, for example.
>
> + seth
Thanks for all the help, guys - really helped my understanding as to how
the GO mappings work in the context of BioC. I had previously assumed
that mappings in all the GO environments were multi-level, and now I
know that really only the GO2ALLPROBES environment is.
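For the archives, here is a toy sketch of how I now picture the three
maps. Everything in it is made up for illustration (the GO:child1,
GO:child2 and GO:parent IDs, the p1_at-style probe names, and the
parents list). It uses plain R lists rather than the real hgu133plus2
environments, so it only mimics the roles Seth described and is not how
the annotation packages are actually built.

```r
# Toy data only: the GO IDs, probe IDs, and DAG edges below are invented.
# Direct annotations (the role hgu133plus2GO2PROBE plays): each probe is
# attached only to its most specific GO term.
go2probe <- list(
  "GO:child1" = c("p1_at", "p2_at"),
  "GO:child2" = c("p3_at")
)

# Hypothetical child -> parent edges of the GO DAG.
parents <- list(
  "GO:child1" = "GO:parent",
  "GO:child2" = "GO:parent",
  "GO:parent" = character(0)
)

# Return a term together with all of its ancestors.
ancestors_and_self <- function(term) {
  out <- term
  for (p in parents[[term]]) out <- union(out, ancestors_and_self(p))
  out
}

# Build the "all probes" map (the role of hgu133plus2GO2ALLPROBES):
# every probe is propagated up to each ancestor of its direct term.
go2allprobes <- list()
for (term in names(go2probe)) {
  for (a in ancestors_and_self(term)) {
    go2allprobes[[a]] <- union(go2allprobes[[a]], go2probe[[term]])
  }
}

# Invert the direct map (the role of hgu133plus2GO): probe -> GO terms.
probe2go <- list()
for (term in names(go2probe)) {
  for (p in go2probe[[term]]) probe2go[[p]] <- union(probe2go[[p]], term)
}

# The parent term has no direct annotations ...
print("GO:parent" %in% names(go2probe))
# ... but it collects the probes of all its children ...
print(go2allprobes[["GO:parent"]])
# ... and the probe-level map never mentions it.
print("GO:parent" %in% probe2go[["p1_at"]])
```

If that picture is right, the GOHyperG terms I could not map back
through hgu133plus2GO would be parent terms like GO:parent here, which
are reachable only through GO2ALLPROBES.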
Jim - sorry for replying to you personally. I meant to send to the list,
but I frequently hit "reply" instead of "reply to all" by accident.
--Jake