[BioC] data inconsistency in Bioconductor
Marc Carlson
mcarlson at fhcrc.org
Wed Sep 9 20:45:44 CEST 2009
Hi Lin,
Thanks for pointing this out. The discrepancy arises because probe
"212627_s_at" is associated with more than one gene in the original
Affymetrix source file. One association comes from the GenBank accession
in that file, and the other comes from the Entrez Gene ID in the same
file (the one you already found). Most probes map to a single gene, but
as you discovered, there are a few exceptions. The gene a probe maps to
can also change from one release to the next as we learn more and as
NCBI updates its data sources.

In recent builds of these packages (in the development branch) we have
greatly improved how probes that map to multiple genes are represented.
The development branch now has a method called toggleProbes(), which
switches between hiding and exposing probe IDs depending on how many
genes they map to. If you like, I would encourage you to download the
devel packages and let us know what you think. In a month or so these
enhanced packages will become the new release packages.
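
For illustration, here is a minimal sketch of how toggleProbes() might be
used on the probe in question. It assumes the devel (or later) versions
of AnnotationDbi and hgu133plus2.db are installed. The filter values
"all" and "multiple" reflect my reading of the AnnotationDbi
documentation and may differ in your installed version, so please check
?toggleProbes before relying on them.

```r
# Sketch only: assumes devel (or later) AnnotationDbi and hgu133plus2.db.
# The filter values "all" and "multiple" are taken from the AnnotationDbi
# documentation as I understand it; verify with ?toggleProbes.
probe <- "212627_s_at"

# Skip the lookups quietly when the annotation package is not installed.
if (requireNamespace("hgu133plus2.db", quietly = TRUE)) {
  library(hgu133plus2.db)  # also attaches AnnotationDbi

  # Default view of the probe -> Entrez Gene map.
  print(mget(probe, hgu133plus2ENTREZID, ifnotfound = NA))

  # Expose every probe, including those that match more than one gene.
  all_map <- toggleProbes(hgu133plus2ENTREZID, "all")
  print(mget(probe, all_map, ifnotfound = NA))

  # Keep only the probes that match more than one gene.
  multi_map <- toggleProbes(hgu133plus2ENTREZID, "multiple")
  print(mget(probe, multi_map, ifnotfound = NA))
}
```

Comparing the three results for the same probe should show whether it is
being hidden or exposed under each setting.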
Marc
Lin Yang wrote:
> Hello, I have a problem using Bioconductor while trying to analyze
> gene chip data published by Affymetrix. The gene chip I am
> analyzing is "HG-U133-PLUS-2 ARRAY", and for probe set "212627_s_at" the
> corresponding EntrezID is "23016" according to Affymetrix. However, in the
> package "hgu133plus2.db", I found it to be "7123". It seems that there is an
> inconsistency between Affy's data and Bioconductor's data. Thank
> you in advance for your help!
>
>
>
> Yours
>