[BioC] Genbank to Unigene IDs
Dave Waddell
dwaddell at nutecsciences.com
Thu Apr 15 16:37:07 CEST 2004
I tried running this but got an error:
> library(AnnBuilder)
> myBaseType <- "gb"
> myDir <- "C:/Temp"
> myBase <- "C:/Temp/tempFile.txt"
> mySrcUrls <- getSrcUrl(src = "ALL",organism = "human")
> ABPkgBuilder(baseName = myBase, srcUrls = mySrcUrls, baseMapType =
+ myBaseType, pkgName = "Hum_Agi1A", pkgPath = myDir,organism =
+ "human", version = "1.0",
+ makeXML = TRUE, author = list(author = "dpritch", maintainer =
+ "dpritch at u.washington.edu"), fromWeb = TRUE)
[1] "It may take me a while to process the data. Be patient!"
Warning message:
cannot open file `C:/R/rw1090beta/library/AnnBuilder/temp/tempOut31783'
Error in unifyMappings(base, ll, ug, otherSrc, fromWeb) :
Failed to get or parse LocusLink data because of:
Error in file(file, "r") : unable to open connection
I had changed this directory from "Read Only" and checked that I had write
permissions from within R:
> setwd("C:/R/rw1090beta/library/AnnBuilder/temp")
> dir()
[1] "file24842Tgo.xml" "README"
> write("Hello")
> dir()
[1] "data" "file24842Tgo.xml" "README"
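A more direct check than write("Hello") is to try creating a file in the directory itself. The following is a minimal sketch in base R. The helper name can_write_to is made up for illustration, and the path is the one from the warning above. file.access() can be misleading on Windows, so the sketch also attempts a real file creation.

```r
# Hypothetical helper (not part of AnnBuilder): TRUE if a file can be
# created in dir_path, FALSE otherwise.
can_write_to <- function(dir_path) {
  # file.access() returns 0 on success and -1 on failure; mode = 2 tests
  # write permission. A missing directory also yields -1.
  if (file.access(dir_path, mode = 2) != 0) return(FALSE)
  # Confirm by actually creating (and then removing) a scratch file.
  probe <- tempfile(pattern = "tempOut", tmpdir = dir_path)
  ok <- suppressWarnings(file.create(probe))
  if (ok) unlink(probe)
  ok
}

can_write_to("C:/R/rw1090beta/library/AnnBuilder/temp")
```

If this returns TRUE, the directory is writable, and the missing tempOut file may point at the download step rather than at permissions.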
I get the same error if I run
example("ABPkgBuilder")
Any suggestions?
Dave.
-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of A.J. Rossini
Sent: Thursday, April 15, 2004 8:48 AM
To: Gordon Smyth
Cc: BioC Mailing List
Subject: Re: [BioC] Genbank to Unigene IDs
Gordon Smyth <smyth at wehi.edu.au> writes:
> I have a list of GenBank IDs for which I'd like the corresponding
> Unigene cluster IDs. What is the easiest way to do this using
> Bioconductor functions? (I've scanned annotate and AnnBuilder help and
> vignettes, although way too quickly.)
>
> For the sake of being specific, here's a concrete example. What's
> Unigene for GB="NM_004551"?
Here's what I'd do (more of a chip-style analysis than instant
WWW-based gratification, which might also be possible):
1. First create a tab-separated, two-column file: the first column
holds dummy probe IDs (could be real or not), the second column the
GenBank IDs. So you'd have one row in a file called "Dummy.tsv":
1 NM_004551
2. Have a script similar to:
library(AnnBuilder)
myBaseType <- "gb"
# myDir maps the directory where you want the data package built ---
# obviously this should be changed for the directory structure on the
# linux box
myDir <- "C:/DavidsData/Annotation_Folders"
# myBase maps the file that contains the mapping of Agilent feature
# numbers to GenBank ID's
myBase <- "C:/DavidsData/Annotation_Folders/Dummy.tsv"
#use AnnBuilder internal lists of data sources
mySrcUrls <- getSrcUrl(src = "ALL",organism = "human")
#invoke ABPkgBuilder
ABPkgBuilder(baseName = myBase, srcUrls = mySrcUrls,
             baseMapType = myBaseType, pkgName = "Hum_Agi1A",
             pkgPath = myDir, organism = "human", version = "1.0",
             makeXML = TRUE,
             author = list(author = "dpritch",
                           maintainer = "dpritch at u.washington.edu"),
             fromWeb = TRUE)
3. Install the package environment.
4. Use it to find the IDs (you can also verify the ID mapping against
the XML output file).
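Steps 1 and 4 above can be sketched in base R as follows. This is only an illustration under stated assumptions: lookup_ids is a hypothetical helper, and the environment name Hum_Agi1AUNIGENE is a guess based on the usual pkgName-plus-UNIGENE naming of annotation packages, so check what the built package actually contains.

```r
# Step 1: write the one-row, two-column base file (tab-separated, no header).
# tempdir() is used only so the sketch runs anywhere; point base_file at the
# path you pass to ABPkgBuilder as baseName.
base_file <- file.path(tempdir(), "Dummy.tsv")
dummy <- data.frame(probe = "1", gb = "NM_004551", stringsAsFactors = FALSE)
write.table(dummy, file = base_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Step 4: look up probe IDs in a mapping environment.
# lookup_ids is a hypothetical helper, not part of AnnBuilder; IDs that are
# not found come back as NA.
lookup_ids <- function(probe_ids, map_env) {
  unlist(mget(probe_ids, envir = map_env, ifnotfound = NA))
}

# ASSUMPTION: after installation the package exposes an environment named
# <pkgName>UNIGENE. Uncomment once the package is installed.
# library(Hum_Agi1A)
# lookup_ids("1", Hum_Agi1AUNIGENE)
```

Keeping the lookup in a small function makes it easy to run the same call against other mapping environments in the package.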
best,
-tony
--
rossini at u.washington.edu http://www.analytics.washington.edu/
Biomedical and Health Informatics University of Washington
Biostatistics, SCHARP/HVTN Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC (M/W): 206-667-7025 FAX=206-667-4812 | use Email