[BioC] hgu133plus2 GO issues
James W. MacDonald
jmacdon at med.umich.edu
Tue Apr 18 18:20:31 CEST 2006
Hi Jake,
Jake wrote:
> Hi list,
>
> Could someone please help me understand the differences between the
> (hgu133plus2)GO, GO2PROBE, GO2ALLPROBES? I've found discrepancies that I
> can't quite explain:
>
> > mget("GO:0042611", hgu133plus2GO2PROBE)
> Error: value for 'GO:0042611' not found
>
>
> > mget("GO:0042611", hgu133plus2GO2ALLPROBES)
> $"GO:0042611"
>           <NA>            IEA            IEA            IEA           <NA>
>    "209309_at"  "217014_s_at"    "210325_at"  "218831_s_at" "1553402_a_at"
>           <NA>           <NA>           <NA>           <NA>           <NA>
>  "206086_x_at"  "206087_x_at"  "210864_x_at"  "211326_x_at"  "211327_x_at"
>           <NA>           <NA>           <NA>           <NA>           <NA>
>  "211328_x_at"  "211329_x_at"  "211330_s_at"  "211331_x_at"  "211332_x_at"
>           <NA>           <NA>           <NA>            IEA           <NA>
>  "211863_x_at"  "211866_x_at"  "214647_s_at"    "235754_at"  "213932_x_at"
>            IEA           <NA>           <NA>            IEA           <NA>
>  "215313_x_at"  "208729_x_at"  "209140_x_at"  "211911_x_at"  "208812_x_at"
>           <NA>           <NA>            IEA           <NA>           <NA>
>  "211799_x_at"  "214459_x_at"  "216526_x_at"    "200904_at"  "200905_x_at"
>            IEA           <NA>           <NA>            IEA           <NA>
>  "217456_x_at"  "204806_x_at"  "221875_x_at"    "221978_at"  "210514_x_at"
>           <NA>           <NA>           <NA>            IEA            IEA
>  "211528_x_at"  "211529_x_at"  "211530_x_at"  "217436_x_at"    "231748_at"
>           <NA>            IEA            IEA            IEA
>    "221291_at"    "238542_at"    "221323_at" "1552777_a_at"
>
> and finally...
>
> ### "208729_x_at" is one of the probes returned with the above command
>
> > grep("GO:0042611", unlist(mget("208729_x_at", hgu133plus2GO)))
> numeric(0)
>
>
>
> "208729_x_at" is on the hgu133plus2 chip, but GO and GO2ALLPROBES don't
> map it to the same GO ID.
>
> Is there something wrong here or am I just missing something? If
> different, which is the most "reliable" mapping? I'm concerned because
> I went through to validate GO IDs I had gotten from the GOHyperG
> function (a total of 314), and 117 of those I could not map back to my
> significant probe list using the hgu133plus2GO annotation. I noticed by
> looking at the GOHyperG function that it uses information from
> GO2ALLPROBES.
Here is the difference:
hgu133plus2GO maps probe IDs to GO terms.
hgu133plus2GO2PROBE maps GO terms to the probe IDs annotated directly to
them.
hgu133plus2GO2ALLPROBES maps GO terms to the probe IDs annotated to that
term or to any of its child terms.
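To make the relationship concrete, here is a minimal sketch on toy data.
The probe names, the GO labels and the `ancestors` list are all made up
for illustration and are not taken from the annotation package. It
assumes GO2ALLPROBES can be thought of as the direct probe-to-term
annotations with each term's ancestors added before the mapping is
inverted:

```r
# Toy direct annotations: probe -> GO terms (the role hgu133plus2GO plays).
probe2go <- list(
  probeA = "GO:child",
  probeB = "GO:other"
)

# Hypothetical GO structure: each term -> its ancestor terms.
ancestors <- list(
  "GO:child"  = "GO:parent",
  "GO:parent" = character(0),
  "GO:other"  = character(0)
)

# Invert a probe -> terms list into a term -> probes list.
invert <- function(p2g) {
  terms  <- unlist(p2g, use.names = FALSE)
  probes <- rep(names(p2g), times = lengths(p2g))
  split(probes, terms)
}

# Direct mapping only (the role of GO2PROBE).
go2probe <- invert(probe2go)

# Add every ancestor of each directly annotated term, then invert
# (the role of GO2ALLPROBES).
probe2go_all <- lapply(probe2go, function(terms) {
  unique(c(terms, unlist(ancestors[terms], use.names = FALSE)))
})
go2allprobes <- invert(probe2go_all)

"GO:parent" %in% names(go2probe)   # FALSE: nothing is annotated directly
go2allprobes[["GO:parent"]]        # "probeA", reached through GO:child
```

In this toy case the parent term has no entry at all in the direct
mapping, yet it still picks up probeA through its child.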
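So there isn't really an issue of reliability here, just an issue of
what you want. In your case, 208729_x_at doesn't map directly to
GO:0042611, but it does map to children of that GO term (for instance
GO:0042612). Since GOHyperG uses GO2ALLPROBES, as you noticed, that would
also explain the terms you could not map back through hgu133plus2GO.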
> sapply(get("208729_x_at", hgu133plus2GO), function(x) x[[1]])
  GO:0005624   GO:0005887   GO:0016020   GO:0016021   GO:0019882   GO:0019883
"GO:0005624" "GO:0005887" "GO:0016020" "GO:0016021" "GO:0019882" "GO:0019883"
  GO:0019885   GO:0030106   GO:0030106   GO:0042612
"GO:0019885" "GO:0030106" "GO:0030106" "GO:0042612"
> grep("208729_x_at",get("GO:0042612", hgu133plus2GO2PROBE))
[1] 20
> grep("208729_x_at",get("GO:0042611", hgu133plus2GO2PROBE))
Error in get(x, envir, mode, inherits) : variable "GO:0042611" was not found
> grep("208729_x_at",get("GO:0042611", hgu133plus2GO2ALLPROBES))
[1] 20
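If the goal is to map GOHyperG terms back to a list of significant
probes, one option is to look the terms up in the same kind of mapping
GOHyperG uses. The following is only a sketch: `sig_probes_for_term` is
a hypothetical helper, `toy` is a stand-in environment so the code runs
without the annotation package, and `sig` is a made-up probe list. In a
real session you would presumably pass hgu133plus2GO2ALLPROBES as
`go2all`.

```r
# Hypothetical helper: which significant probes sit under a GO term?
# `go2all` is assumed to be an environment keyed by GO ID, shaped like
# hgu133plus2GO2ALLPROBES.
sig_probes_for_term <- function(go_id, sig_probes, go2all) {
  # A term with no annotated probes has no entry, so avoid the get() error.
  if (!exists(go_id, envir = go2all, inherits = FALSE)) {
    return(character(0))
  }
  sig_probes[sig_probes %in% get(go_id, envir = go2all)]
}

# Toy stand-in for the real environment.
toy <- new.env()
assign("GO:0042611", c("209309_at", "208729_x_at"), envir = toy)

# Made-up list of significant probes.
sig <- c("208729_x_at", "1007_s_at")

sig_probes_for_term("GO:0042611", sig, toy)   # "208729_x_at"
sig_probes_for_term("GO:0000000", sig, toy)   # character(0)
```

Checking for the key first avoids the "not found" error shown above for
terms that have no entry in the mapping.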
HTH,
Jim
>
> Any help/enlightenment is much appreciated.
>
> PS - using R 2.2.1 with hgu133plus2 1.10.0
>
> --Jake
>
--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623