[BioC] Annotation data package is not being created

Law, Annie Annie.Law at nrc-cnrc.gc.ca
Mon Dec 15 15:04:15 MET 2003


Hi John,

I'm not sure what you mean when you say the program stopped because the
clones are not mapped to "enzyme annotation data". I don't think you mean
that my input has to code for enzymes? :)  Could you please clarify the
path that ABPkgBuilder takes to get information from databases such as
LocusLink, UniGene, Golden Path, Gene Ontology, and KEGG?  My sample input
file is representative of the rest of my data.  I took my clone IDs, went
to the I.M.A.G.E. site, and got the corresponding GenBank accession
numbers.  With these numbers I want to get the UniGene information.

I looked at your example "How to use AnnBuilder" and compared its GenBank
accession numbers to mine.  When I type, for example, D90278 into Entrez,
I find that it has a locus and that one of the link-outs is Gene.  However,
when I type one of my GenBank accession numbers, T65425, into Entrez, a
locus is not available and the only useful link is to UniGene.  This will
be the case for my data.  Is it possible that ABPkgBuilder requires a
locus for each GenBank accession number in order to access information
from the other databases (LocusLink, UniGene, Golden Path, Gene Ontology,
and KEGG)?

If ABPkgBuilder does not allow me to get a UniGene cluster from a GenBank
accession number for my clones, are there other functions within
Bioconductor that will enable me to achieve my goal?

thanks very much for your help,
Annie.

-----Original Message-----
From: John Zhang [mailto:jzhang at jimmy.harvard.edu]
Sent: Friday, December 12, 2003 1:56 PM
To: Annie.Law at nrc-cnrc.gc.ca
Subject: RE: [BioC] Annotation data package is not being created


I tried your base file. It stopped because the clones were not mapped to
any enzyme annotation data. I will add some error-trapping functions, but
meanwhile you may try some real data to see how it works for you. The most
time-consuming parts are the source data processing, so the size of your
base file does not reduce the execution time much.
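
As a rough illustration of how this kind of colnames error can arise in
plain R: the sketch below assumes a mapping table that ends up as a plain
vector instead of a matrix. That mechanism is an assumption, not something
taken from the AnnBuilder source, and the names mapped, oneCol and safe are
hypothetical.

```r
# Hypothetical one-row mapping table: column 1 = clone ID, column 2 = annotation.
mapped <- matrix(c("cloneA", "1.1.1.1"), ncol = 2)

# Selecting one column without drop = FALSE returns a plain vector (no dim).
oneCol <- mapped[, 1]

# Setting colnames on a vector fails; capture the message instead of stopping.
res <- tryCatch({
    colnames(oneCol) <- "PROBE"
    "ok"
}, error = function(e) conditionMessage(e))
print(res)

# Keeping the matrix shape with drop = FALSE avoids the error.
safe <- mapped[, 1, drop = FALSE]
colnames(safe) <- "PROBE"
print(dim(safe))
```

A check of this sort (testing that the object still has two dimensions
before naming its columns) is the kind of error trapping meant above.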

>From: "Law, Annie" <Annie.Law at nrc-cnrc.gc.ca>
>To: "'John Zhang'" <jzhang at jimmy.harvard.edu>
>Subject: RE: [BioC] Annotation data package is not being created
>Date: Thu, 11 Dec 2003 16:00:46 -0500
>
>Hi John,
>
>Here is my file.
>
>thank you,
>Annie.
>
>
>-----Original Message-----
>From: John Zhang [mailto:jzhang at jimmy.harvard.edu]
>Sent: Thursday, December 11, 2003 3:41 PM
>To: Annie.Law at nrc-cnrc.gc.ca
>Cc: bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] Annotation data package is not being created
>
>
>Could you send me a copy of your base file so that I can try it and figure
>out what might be wrong? Thanks.
>>
>>I would appreciate help with the following.  I was following the vignette
>>"How to use AnnBuilder" and tried to adapt it to my goal of creating an
>>annotation data package with the UniGene identifiers.  I made some very
>>minor changes and used the file sampclonegb2, which is just a small text
>>file whose first column is a list of IMAGE clone IDs and whose second
>>column is a list of GenBank accession numbers.  I used the following
>>lines; sampclonegb2 seems to load properly, but I finally get the error
>>message listed below.  Some output files are created (for example, the
>>XML file), but my input data has not been mapped to any of the
>>information from the databases.
>>"Error in "colnames<-"(`*tmp*`, value = colNames) :
>>        attempt to set colnames on object with less than two dimensions"
>>I am not sure what I am missing.
>>
>>Also, my current file sampclonegb2 is very simple in that I have one
>>GenBank accession number for each clone ID.  My actual source file
>>contains cases where more than one GenBank accession number is
>>associated with a clone ID.  What is the best way to approach this?
>>
>>thanks very much,
>>Annie.
>>
>>
>>library(AnnBuilder)
>>read.table(file.path(.path.package("AnnBuilder"), "data", "sampclonegb2"),
>>           sep = "\t", header = FALSE, as.is = TRUE)
>>myBase <- file.path(.path.package("AnnBuilder"), "data", "sampclonegb2")
>>myBaseType <- "gb"
>>mySrcUrls <- getSrcUrl("all", organism = "human")
>>mySrcUrls
>>myDir <- tempdir()
>>if (.Platform$OS.type == "unix") {
>>    fromWeb <- TRUE
>>} else {
>>    fromWeb <- FALSE
>>}
>>if (.Platform$OS.type != "windows") {
>>    ABPkgBuilder(baseName = myBase, srcUrls = mySrcUrls,
>>                 baseMapType = myBaseType, otherSrc = NULL,
>>                 pkgName = "abmyPkg", pkgPath = myDir,
>>                 organism = "human", version = "1.1.0", makeXML = TRUE,
>>                 author = list(author = "Annie",
>>                               maintainer = "myname at myemail.com"),
>>                 fromWeb = fromWeb)
>>}
>>
>>"It may take me a while to process the data. Be patient!"
>>Error in "colnames<-"(`*tmp*`, value = colNames) :
>>        attempt to set colnames on object with less than two dimensions
>>
>>_______________________________________________
>>Bioconductor mailing list
>>Bioconductor at stat.math.ethz.ch
>>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>
>Jianhua Zhang
>Department of Biostatistics
>Dana-Farber Cancer Institute
>44 Binney Street
>Boston, MA 02115-6084
>

Jianhua Zhang
Department of Biostatistics
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084


