[Bioc-devel] arabidopsis annotations

Herve Pages hpages at fhcrc.org
Mon Aug 27 17:20:19 CEST 2007


Hi Nianhua,

nli at fhcrc.org wrote:
> Hi, Herve,
> 
> I feel this is more of a data source problem than a data value problem.

In addition to the data source problem, there is a map naming problem.

> The reason that we have this inconsistency in ag and ath1121501 is that we
> extract enzyme information from AraCyc rather than from KEGG. KEGG provides
> EC numbers but AraCyc only provides enzyme names. I suggested using KEGG
> instead of AraCyc when I updated AthPkgBuilder last year, but only got
> halfway through: we added KEGG pathway annotation to the package but still
> kept the AraCyc pathway data (post link:
> http://article.gmane.org/gmane.science.biology.informatics.conductor/9527/match=arabidopsis
> ). Maybe you can use a similar solution: add KEGG enzyme annotation and move
> the AraCyc enzyme annotation into a differently named object.
> 
> I would also like to suggest posting this question on bioc so that you
> reach a larger audience.

Thanks for the feedback. After reading it we decided to go for solution D,
i.e. to provide both mappings (probes <-> enzyme names and probes <-> EC
numbers). The data currently in the ENZYME map (probes <-> enzyme names) will
be moved to a new ARACYCENZYME map, and from now on the ENZYME map will
contain the "probes <-> EC numbers" mapping.
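
To make the new layout concrete, here is a rough sketch in Python (not the
actual package code) of the two forward maps and their reverse maps. The probe
ids are made up for illustration, and the name ARACYCENZYME2PROBE for the
reverse of ARACYCENZYME is only an assumption by analogy with ENZYME2PROBE.

```python
from collections import defaultdict

# Hypothetical probe ids, for illustration only.
# ENZYME: probes -> EC numbers (the new content of the ENZYME map).
ENZYME = {
    "probe_1": ["1.1.1.1"],
    "probe_2": ["1.1.1.1", "2.7.1.1"],
}

# ARACYCENZYME: probes -> enzyme names (the data formerly in ENZYME).
ARACYCENZYME = {
    "probe_1": ["alcohol dehydrogenase"],
    "probe_2": ["alcohol dehydrogenase", "hexokinase"],
}


def reverse_map(forward):
    """Invert a probe -> values map into a value -> probes map."""
    reverse = defaultdict(list)
    for probe, values in forward.items():
        for value in values:
            reverse[value].append(probe)
    return dict(reverse)


ENZYME2PROBE = reverse_map(ENZYME)
# Assumed name, by analogy with ENZYME2PROBE.
ARACYCENZYME2PROBE = reverse_map(ARACYCENZYME)
```

With this layout, code that needs EC numbers looks them up in ENZYME, and
code that relied on the old enzyme names switches to ARACYCENZYME.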

Cheers,
H.


> 
> hope this helps
> 
> nianhua
> 
> Quoting Herve Pages <hpages at fhcrc.org>:
> 
>> Hi Bioc-developers,
>>
>> In the process of migrating the arabidopsis annotations to the new
>> sqlite-based infrastructure, we found a problem with the current
>> ENZYME/ENZYME2PROBE maps. We'd like to know what you think (especially
>> if you've been using these maps).
>>
>> In the ag and ath1121501 packages, the ENZYME/ENZYME2PROBE maps link
>> probe ids to enzyme names, and not to EC numbers as in _all_ other
>> chip-based packages. In addition, the man pages for those maps are
>> incorrect: they claim that those 2 maps are between manufacturer ids
>> and EC numbers (not really a surprise, because AnnBuilder uses the same
>> template as for any other package to generate the ENZYME/ENZYME2PROBE
>> man pages).
>>
>> This is not a satisfactory situation and we'd like to improve things a
>> little for the upcoming ag.db and ath1121501.db packages. There are of
>> course different ways we could address the problem:
>>
>>   A. just fix the man pages:
>>      - pro: easy and 100% compatible with the current (environment-based)
>>             ag and ath1121501 packages
>>      - con: for arabidopsis, the ENZYME/ENZYME2PROBE maps will remain
>>             different from what they are in all other chip-based packages
>>             + people that want the EC numbers still don't have them
>>
>>   B. fix the ENZYME/ENZYME2PROBE maps so that they are consistent with all
>>      other ENZYME/ENZYME2PROBE maps
>>      - pro: consistency with all other chip-based packages
>>      - con: enzyme names are gone, so user code using the
>>             ENZYME/ENZYME2PROBE maps from ag and ath1121501 will need to
>>             be modified to work with ag.db and ath1121501.db
>>
>>   C. rename the ENZYME/ENZYME2PROBE maps -> ECNAME/ECNAME2PROBE and
>>      deprecate the ENZYME/ENZYME2PROBE maps
>>      - pro: use the standard deprecation procedure for a smooth
>>             transition period
>>      - con: people that want the EC numbers right now still don't have
>>             them (they'll need to wait for BioC 2.2)
>>
>>   D. fix the ENZYME/ENZYME2PROBE maps and add 2 new maps (e.g.
>>      ECNAME/ECNAME2PROBE) for the mapping between probe ids and enzyme
>>      names
>>      - pro: consistency and completeness
>>      - con: user code using the ENZYME/ENZYME2PROBE maps from ag and
>>             ath1121501 will need to use the ECNAME/ECNAME2PROBE maps
>>             instead (but here the impact on the user is not as bad as
>>             with B since the data they have been using so far is still
>>             available, just under different names)
>>
>>   E. anything else?
>>
>> Thanks for your feedback!
>>
>> H.
>>
>> _______________________________________________
>> Bioc-devel at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>


