[BioC] Is there any way to map a set of EST IDs to GenBank accessions?
James F. Reid
james.reid at ifom-ieo-campus.it
Tue Feb 2 22:42:22 CET 2010
Hi Peter,
You could use the makeDBPackage function from the AnnotationDbi package
with baseMapType = "gb" for GenBank entries, which would be either the
3' or 5' accession numbers of the clones (e.g. 5':T39154,
3':T40438 from your first entry). You could simply use the 3' ones
(3ACC), but if you want to use both you could run it twice to highlight
inconsistencies in gene assignment for any given entry.
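
As far as I recall, makeDBPackage reads its probe mappings from a
headerless, tab-delimited file with the probe ID in the first column and
the accession in the second; please check the function's help page before
relying on that. Under that assumption, here is a minimal Python sketch of
how the two input files (one per run) might be prepared from the GPL table.
The names GPL_EXCERPT, probe_to_accession and to_mapping_file are
hypothetical, and the NAME values are abbreviated.

```python
import csv
import io

# Hypothetical tab-delimited excerpt of the GPL1290 table, using the column
# names and two of the rows quoted in the question (NAME values abbreviated).
GPL_EXCERPT = (
    "ID\tNAME\tCLONE_ID\t5ACC\t3ACC\tGB_LIST\n"
    "2\tEST Chr.X\tIMAGE:60298\tT39213\tT40480\tT39213,T40480\n"
    "3\tRPL3\tIMAGE:60436\tT39295\tT40510\tT39295,T40510\n"
)


def probe_to_accession(table_text, acc_column):
    """Return (probe ID, accession) pairs, skipping rows with no accession."""
    # QUOTE_NONE keeps the 5'/3' marks in real NAME fields from being
    # treated as quoting characters.
    reader = csv.DictReader(
        io.StringIO(table_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    pairs = []
    for row in reader:
        acc = (row.get(acc_column) or "").strip()
        if acc:
            pairs.append((row["ID"], acc))
    return pairs


def to_mapping_file(pairs):
    """Format the pairs as headerless, tab-separated lines."""
    return "".join(f"{probe}\t{acc}\n" for probe, acc in pairs)


# One mapping per run: 3' accessions first, then 5' accessions.
three_prime = probe_to_accession(GPL_EXCERPT, "3ACC")
five_prime = probe_to_accession(GPL_EXCERPT, "5ACC")
print(to_mapping_file(three_prime), end="")
```

Writing each string to its own file would give one input per
makeDBPackage run, and the two resulting annotations could then be compared
probe by probe.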
HTH.
J.
On 02/02/2010 11:09, Peter Waltman wrote:
> Hi -
>
> I'm trying to annotate a custom cDNA array, available from GEO
> (specifically, GDS1761). It uses a custom annotation (GPL1290) which
> provides the ESTs that were mapped to the different spots, e.g. the first
> few rows are:
> ID | NAME | CLONE_ID | 5ACC | 3ACC | GB_LIST
> 1 | SID W 60204, Homo sapiens C2H2 zinc finger protein pseudogene, mRNA sequence [5':T39154, 3':T40438] | IMAGE:60204 | T39154 | T40438 | T39154,T40438
> 2 | EST Chr.X [60298, (D), 5':T39213, 3':T40480] | IMAGE:60298 | T39213 | T40480 | T39213,T40480
> 3 | RPL3 Ribosomal protein L3 Chr.22 [60436, (EW), 5':T39295, 3':T40510] | IMAGE:60436 | T39295 | T40510 | T39295,T40510
> 4 | ESTSID 60474, [5':T39311, 3':T40516] | IMAGE:60474 | T39311 | T40516 | T39311,T40516
> 5 | SID 60218, [5':T39165, 3':T40450] | IMAGE:60218 | T39165 | T40450 | T39165,T40450
> 6 | EST Chr.18 [60268, (IR), 5':T39192, 3':T40467] | IMAGE:60268 | T39192 | T40467 | T39192,T40467
>
> The T##### IDs are the EST accession IDs, which can be queried from NCBI,
> e.g. for T39213 (the second row),
> http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=Nucleotide&term=T39213
> will find one result,
> http://www.ncbi.nlm.nih.gov/nucest/T39213.1?ordinalpos=1&itool=EntrezSystem2.PEntrez.Sequence.Sequence_ResultsPanel.Sequence_RVDocSum
>
> On the result that's returned, you can find a GenBank GI number that can be
> used to find the gene annotation (in this case 646973), but I can't figure
> out any way to do this for a large number of EST accessions (>9600).
>
> Any suggestions?
>
> Thanks!
>
> Peter