[BioC] Building the tomato annotation library(Affy)
Nianhua Li
nli at fhcrc.org
Wed May 9 20:28:23 CEST 2007
Dear Dr Philip de Groot,
Thanks for the report, and sorry for the late reply. The "refLink.txt" error was
intentional, but we have changed it to a more informative message. The "sort.list"
error was a bug that occurs when parsing KEGG pathway/enzyme data. The KEGG data
file usually contains both pathway and enzyme data for a given organism, but for
tomato it has only pathway data (actually ESTs only), and this broke the code. We
have updated AnnBuilder. Please try the latest version in the bioc 2.1 repository or
download it from http://bioconductor.org/packages/2.1/bioc/html/AnnBuilder.html
A test run for the Affymetrix tomato array shows that the annotation is very sparse.
Here is the QC data, just FYI:
Mappings found for probe based rda files:
tomatoACCNUM found 10198 of 10209
tomatoCHR found 0 of 10209
tomatoENTREZID found 1288 of 10209
tomatoENZYME found 0 of 10209
tomatoGENENAME found 1288 of 10209
tomatoMAP found 0 of 10209
tomatoPATH found 0 of 10209
tomatoPMID found 778 of 10209
tomatoREFSEQ found 2 of 10209
tomatoSYMBOL found 1288 of 10209
tomatoUNIGENE found 1288 of 10209
Mappings found for non-probe based rda files:
tomatoORGANISM found 1
tomatoPMID2PROBE found 359
We used only the GenBank IDs from the Affymetrix csv file, just as you did. You
can also extract the Entrez IDs from the csv file and pass them as "otherSrc" to
ABPkgBuilder. This may increase the annotation coverage.
Good luck,
Martin and Nianhua
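P.S. The Entrez ID extraction mentioned above could be sketched roughly as follows. This is a minimal Python sketch and not part of AnnBuilder. The column names "Probe Set ID" and "Entrez Gene", the "---" placeholder for missing values, the "///" separator between multiple IDs, and the sample rows are all assumptions about the csv layout, so please check them against your own file. The two-column tab-delimited output is likewise an assumption about what "otherSrc" accepts; the ABPkgBuilder help page is the authority on that.

```python
import csv

def extract_entrez_pairs(csv_text, probe_col="Probe Set ID", entrez_col="Entrez Gene"):
    """Return (probe_id, entrez_id) pairs from Affymetrix-style annotation csv text.

    The column names are assumptions; adjust them to match the real header.
    """
    # Drop blank lines and '#' comment lines that may precede the header row.
    lines = [ln for ln in csv_text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    pairs = []
    for row in csv.DictReader(lines):
        probe = (row.get(probe_col) or "").strip()
        raw = (row.get(entrez_col) or "").strip()
        # Skip probes with no Entrez ID ('---' is the assumed placeholder).
        if not probe or raw in ("", "---"):
            continue
        # When several IDs are listed, keep only the first one.
        first = raw.split("///")[0].strip()
        if first.isdigit():
            pairs.append((probe, first))
    return pairs

def to_tab_delimited(pairs):
    """Format the pairs as one 'probe<TAB>entrez' line per probe."""
    return "".join(f"{probe}\t{entrez}\n" for probe, entrez in pairs)

# Hypothetical sample rows, for illustration only (not real tomato array data).
sample = '''#%hypothetical_comment_line=1
"Probe Set ID","Representative Public ID","Entrez Gene"
"probe_1_at","AB000001","543210"
"probe_2_at","AB000002","---"
"probe_3_at","AB000003","543211 /// 543212"
'''

pairs = extract_entrez_pairs(sample)
print(to_tab_delimited(pairs), end="")
```

Writing the result of to_tab_delimited to a file gives one probe-to-Entrez mapping per line, and that file's path is what one would then try as the "otherSrc" value. Keeping only the first ID per probe is a simplification; probes that list several IDs may deserve a closer look.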