[BioC] Is there anyway to map a set of EST id's to genBank accessions?

James F. Reid james.reid at ifom-ieo-campus.it
Tue Feb 2 22:42:22 CET 2010


Hi Peter,

you could use the makeDBPackage function from the AnnotationDbi package
with baseMapType = "gb" for GenBank entries, which here would be either
the 3' or the 5' accession numbers of the clones (e.g. 5':T39154,
3':T40438 from your first entry). You could simply use the 3' ones
(3ACC), but if you want to use both you could run it twice, once per
accession column, to highlight inconsistencies in gene assignment for
any given entry.
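
As a rough sketch of the preparation step: makeDBPackage reads its base
mappings from a file, which as far as I know should be a two-column,
tab-delimited file (probe ID, then GenBank accession) with no header.
The snippet below builds such a file from the GPL table. It is written
in Python only for illustration; the function names, the SAMPLE data
and the assumption that the GPL table has already been stripped of its
GEO comment lines are mine, not part of AnnotationDbi.

```python
import csv
import io


def build_base_map(gpl_text, acc_column="3ACC"):
    """Return (probe_id, accession) pairs from a tab-delimited GPL table.

    Assumes the first line is the header row (ID, NAME, CLONE_ID, 5ACC,
    3ACC, GB_LIST) and that GEO comment lines were removed beforehand.
    """
    reader = csv.DictReader(
        io.StringIO(gpl_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    pairs = []
    for row in reader:
        acc = (row.get(acc_column) or "").strip()
        if acc:  # skip spots with no accession in the chosen column
            pairs.append((row["ID"].strip(), acc))
    return pairs


def write_base_map(pairs, path):
    """Write the pairs as a headerless, tab-delimited two-column file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(pairs)


# Hypothetical two-row excerpt of GPL1290, with shortened NAME fields.
SAMPLE = (
    "ID\tNAME\tCLONE_ID\t5ACC\t3ACC\tGB_LIST\n"
    "1\tSID W 60204\tIMAGE:60204\tT39154\tT40438\tT39154,T40438\n"
    "2\tEST Chr.X\tIMAGE:60298\tT39213\tT40480\tT39213,T40480\n"
)

three_prime = build_base_map(SAMPLE, "3ACC")
five_prime = build_base_map(SAMPLE, "5ACC")
```

Writing three_prime and five_prime to two separate files with
write_base_map would give one input file per makeDBPackage run, so the
two resulting annotations can be compared spot by spot.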

HTH.
J.

On 02/02/2010 11:09, Peter Waltman wrote:
> Hi -
>
> I'm trying to annotate a custom cDNA array, available from GEO
> (specifically, GDS1761).  It uses a custom annotation (GPL1290) which
> provides the ESTs that were mapped to the different spots, i.e. the first 6
> rows are:
> ID	NAME	CLONE_ID	5ACC	3ACC	GB_LIST
> 1	SID W 60204, Homo sapiens C2H2 zinc finger protein pseudogene, mRNA sequence [5':T39154, 3':T40438]	IMAGE:60204	T39154	T40438	T39154,T40438
> 2	EST Chr.X [60298, (D), 5':T39213, 3':T40480]	IMAGE:60298	T39213	T40480	T39213,T40480
> 3	RPL3 Ribosomal protein L3 Chr.22 [60436, (EW), 5':T39295, 3':T40510]	IMAGE:60436	T39295	T40510	T39295,T40510
> 4	ESTSID  60474,  [5':T39311, 3':T40516]	IMAGE:60474	T39311	T40516	T39311,T40516
> 5	SID  60218,  [5':T39165, 3':T40450]	IMAGE:60218	T39165	T40450	T39165,T40450
> 6	EST Chr.18 [60268, (IR), 5':T39192, 3':T40467]	IMAGE:60268	T39192	T40467	T39192,T40467
>
> The T##### IDs are the EST accession IDs, which can be queried from NCBI,
> e.g. for the second row,
> http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=Nucleotide&term=T39213
> will find one result,
> http://www.ncbi.nlm.nih.gov/nucest/T39213.1?ordinalpos=1&itool=EntrezSystem2.PEntrez.Sequence.Sequence_ResultsPanel.Sequence_RVDocSum
>
> On the result that's returned, you can find a GenBank GI number (in this
> case 646973) that can be used to find the gene annotation, but I can't
> figure out any way to do this for a large number of EST accessions (>9600).
>
> Any suggestions?
>
> Thanks!
>
> Peter


