[Bioc-devel] arabidopsis annotations
hpages at fhcrc.org
Mon Aug 27 17:20:19 CEST 2007
nli at fhcrc.org wrote:
> Hi, Herve,
> I feel this is more of a data source problem than a data value problem.
In addition to the data source problem, there is a map naming problem.
> The reason we have this inconsistency in ag and ath1121501 is that we
> extract enzyme information from AraCyc rather than from KEGG. KEGG provides
> EC numbers but AraCyc only provides enzyme names. I tried to suggest using KEGG
> instead of AraCyc when I updated AthPkgBuilder last year, but only got halfway
> through: we added KEGG pathway annotation to the package but still kept AraCyc
> pathway data (post link:
> ). Maybe you can use a similar solution: add KEGG enzyme annotation and rename
> AraCyc enzyme annotation into a different object.
> I would also like to suggest posting this question on bioc so that you get a
> bigger audience group.
Thanks for the feedback. After reading it we decided to go for solution D, i.e.
to provide both mappings (probes <-> enzyme names and probes <-> EC numbers).
The data currently in the ENZYME map (probes <-> enzyme names) will be moved
to a new ARACYCENZYME map, and from now on the ENZYME map will contain
the "probes <-> EC numbers" mapping.
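
To make the split concrete, here is a minimal sketch in plain Python (not the Bioconductor API). The probe ids, the `migrate` and `reverse_map` helpers, and the tiny KEGG-style table are all hypothetical and only illustrate the idea: the old name-based data is kept under ARACYCENZYME, and ENZYME is refilled with EC numbers.

```python
# Hypothetical probe ids and annotations, for illustration only.
# Current ENZYME map in ag / ath1121501: probe id -> enzyme names.
old_enzyme = {
    "probe_a": ["alcohol dehydrogenase"],
    "probe_b": ["hexokinase"],
    "probe_c": [],  # probe with no enzyme annotation
}

# Assumed EC-number source (e.g. KEGG-derived): probe id -> EC numbers.
kegg_ec = {
    "probe_a": ["1.1.1.1"],
    "probe_b": ["2.7.1.1"],
}


def migrate(old_enzyme, kegg_ec):
    """Return (ENZYME, ARACYCENZYME) as described for solution D."""
    # The existing names move, unchanged, to the new ARACYCENZYME map.
    aracycenzyme = {probe: list(names) for probe, names in old_enzyme.items()}
    # ENZYME now holds EC numbers; probes without one get an empty list.
    enzyme = {probe: list(kegg_ec.get(probe, [])) for probe in old_enzyme}
    return enzyme, aracycenzyme


def reverse_map(forward):
    """Build the reverse map (e.g. ENZYME2PROBE) from a forward map."""
    reverse = {}
    for probe, values in forward.items():
        for value in values:
            reverse.setdefault(value, []).append(probe)
    return reverse


enzyme, aracycenzyme = migrate(old_enzyme, kegg_ec)
enzyme2probe = reverse_map(enzyme)
```

User code that relied on enzyme names would read from the ARACYCENZYME-style map, while code expecting EC numbers would use ENZYME/ENZYME2PROBE as in the other chip-based packages.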
> hope this helps
> Quoting Herve Pages <hpages at fhcrc.org>:
>> Hi Bioc-developers,
>> In the process of migrating the arabidopsis annotations to the new
>> infrastructure, we found a problem with the current ENZYME/ENZYME2PROBE maps.
>> We'd like to know what you think (especially if you've been using these maps).
>> In the ag and ath1121501 packages the ENZYME/ENZYME2PROBE maps are linking
>> probe ids to enzyme names, and not to EC numbers like in _all_ other
>> chip-based packages.
>> In addition, the man pages for those maps are incorrect: they claim that those
>> 2 maps are between manufacturer ids and EC numbers (not really a surprise, in
>> fact, since AnnBuilder uses the same template as for any other package to
>> generate the ENZYME/ENZYME2PROBE man pages).
>> This is not a satisfying situation and we'd like to improve things a little
>> bit for the upcoming ag.db and ath1121501.db packages. There are of course
>> several ways we could address the problem:
>> A. just fix the man pages:
>> - pro: easy and 100% compatible with the current (environment-based) ag
>> and ath1121501 packages
>> - con: for arabidopsis, the ENZYME/ENZYME2PROBE maps will remain different
>> from what they are in all other chip-based packages + people who
>> want the EC numbers still don't have them
>> B. fix the ENZYME/ENZYME2PROBE maps so that they are consistent with all
>> other ENZYME/ENZYME2PROBE maps
>> - pro: consistency across all other chip-based packages
>> - con: enzyme names are gone so the user code using the
>> ENZYME/ENZYME2PROBE maps
>> from ag and ath1121501 will need to be modified to work with
>> ag.db and
>> C. rename the ENZYME/ENZYME2PROBE maps -> ECNAME/ECNAME2PROBE and deprecate
>> the ENZYME/ENZYME2PROBE maps
>> - pro: use the standard deprecation procedure for a smooth transition
>> - con: people that want the EC numbers right now still don't have them
>> (they'll need to wait for BioC 2.2)
>> D. fix the ENZYME/ENZYME2PROBE maps and add 2 new maps (e.g.
>> for the mapping between probe ids and enzyme names
>> - pro: consistency and completeness
>> - con: the user code using the ENZYME/ENZYME2PROBE maps from ag and
>> will need to use the ECNAME/ECNAME2PROBE maps instead (but here the
>> impact on the user is not as bad as with B since the data they
>> have been using so far is still available but under different names)
>> E. anything else?
>> Thanks for your feedback!