[BioC] Error message from AnnBuilder
Ting-Yuan Liu
tliu at fhcrc.org
Fri Mar 31 23:00:09 CEST 2006
Hi Hua,
On Wed, 29 Mar 2006, Hua Weng wrote:
> Hi Ting-Yuan:
>
> As you suggested, I tried to use "live link" and get files through the web.
> First I set source URL like this:
> mySrcUrls <- c(EG="ftp://ftp.ncbi.nih.gov/gene/DATA",
> UG="ftp://ftp.ncbi.nih.gov/repository/UniGene/Bos_taurus/Bt.data.gz",
> GP="ftp://hgdownload.cse.ucsc.edu/goldenPath/currentGenomes",
> GO="http://www.godatabase.org/dev/database/archive/latest",
> KEGG="ftp://ftp.genome.ad.jp/pub/kegg/pathways")
> I got the following errors:
> Error in srcUrls[["KEGGGENOME"]] : subscript out of bounds
> In addition: Warning message:
> Organism Bos taurus is not supported by GoldenPath (GP). in:
> getUCSCUrl(organism)
This is not what I suggested. What I said was NOT to supply your own
source URLs: leave out the srcUrls and fromWeb arguments when you call
ABPkgBuilder. In other words, you should run ABPkgBuilder like this:

  ABPkgBuilder(baseName="hgu95av2.GeneBankID",
               baseMapType="gbNRef",
               pkgName="hgu95av2",
               pkgPath=".",
               organism="Homo sapiens",
               version="0.0.0",
               author=list(
                 authors="Ting-Yuan Liu, ChenWei Lin, Seth Falcon, Jianhua Zhang, James W. MacDonald",
                 maintainer="Ting-Yuan Liu <tliu at fhcrc.org>"
               ))

See? I didn't use the srcUrls or fromWeb arguments. This is what I
mean. AnnBuilder knows how to get the correct source URLs, so you
don't have to worry about them.
>
> I changed the source URLs as follows, and then it works.
> mySrcUrls <- c(EG="ftp://ftp.ncbi.nih.gov/gene/DATA",
> UG="ftp://ftp.ncbi.nih.gov/repository/UniGene/Bos_taurus/Bt.data.gz",
> GO="http://www.godatabase.org/dev/database/archive/latest")
> But I still get seven environments and nothing useful back. I checked my
> GenBank accession IDs, and they can be mapped to Gene IDs, UniGene IDs and
> possibly GO information. Is this because AnnBuilder cannot handle
> organisms other than human, mouse and rat? Have you tested building
> annotation packages for other organisms such as cow or rice?
>
You mean you checked your GenBank accession IDs on the NCBI website, right?
Not all of the information you can find on the NCBI website is
included in the NCBI downloadable files, and AnnBuilder builds packages
from those downloadable files.
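A quick way to see how much of a base file can be mapped at all is to
compare its accessions against whatever accession table you parsed from
the downloaded files. The following is only a sketch in base R under
stated assumptions: the probe IDs and accessions are a few rows from your
base file, while the `gene2acc` table and its gene IDs are made-up
placeholders, not real NCBI data and not part of AnnBuilder.

```r
# A few rows of the base file from this thread: probe ID, GenBank accession.
base <- data.frame(
  probe = c("a2g09", "g1o22", "a1d09", "a1e10", "f5l19"),
  acc   = c("NM_174062", NA, "XM_879288", "NM_175825", "BC102351"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for an accession-to-gene table parsed from a
# downloaded file. The gene IDs are placeholders, not real Entrez Gene IDs.
gene2acc <- data.frame(
  acc     = c("NM_174062", "NM_175825"),
  gene_id = c("gene_A", "gene_B"),
  stringsAsFactors = FALSE
)

# Probes that have an accession at all, and those found in the table.
has_acc <- !is.na(base$acc)
mapped  <- has_acc & base$acc %in% gene2acc$acc

summary_counts <- c(
  probes       = nrow(base),
  no_accession = sum(!has_acc),
  mapped       = sum(mapped),
  unmapped     = sum(has_acc & !mapped)
)
print(summary_counts)
```

If most accessions end up in the "unmapped" group, the downloadable files
simply do not cover them, whatever the website shows.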
We tried to use ABPkgBuilder to build annotation packages for the
Affymetrix grape chips, but without success. We had the same problem
building Arabidopsis annotation packages with ABPkgBuilder, so we
developed a new function in AnnBuilder that builds them from data we
found outside of NCBI. Your case seems very similar to Arabidopsis.
One thing you can do, if there are no confidentiality issues, is send
me the basefiles (off-list, please) and I will try to build the package
and see whether I run into the same problem.
HTH,
Ting-Yuan
> Thank you very much for your advice.
> Hua
>
> -----Original Message-----
> From: Ting-Yuan Liu [mailto:tliu at fhcrc.org]
> Sent: Wednesday, March 29, 2006 12:07 PM
> To: Hua Weng
> Cc: bioconductor at stat.math.ethz.ch
> Subject: RE: Error message from AnnBuilder
>
>
> Hi Hua,
>
> On Tue, 28 Mar 2006, Hua Weng wrote:
>
> > My questions are:
> > 1) If I provide more local annotation files, may I get more information back?
>
> Using more annotation files (basefiles) could improve the mapping
> results, but not the number of environments. It is always a good idea to
> provide as many basefiles as you can.
>
> I am not sure why you didn't get many environments. Could you try
> building the annotation package without local files? I mean you should
> remove the srcUrls and fromWeb arguments so that ABPkgBuilder can download
> the data from the web.
>
> > 2) I didn't get any GO term back, does it mean these genes for cow don't
> > have any GO?
>
> No. If cow didn't have any associated GO information, you would get an
> environment whose values are all NAs. See (1) for how to get more
> environments.
>
> > 3) If I map these cow GenBank accession IDs to Mus musculus, can I get
> > some useful information back? Do I need to change the cow GenBank
> > accession IDs to mouse GenBank accession IDs?
>
> I am not sure if I understand what you mean here. Are you interested in
> the homology between cow and mouse? The package btahomology might be
> what you want. You can find it at
> http://www.bioconductor.org/packages/data/annotation/1.8/html/btahomology.html
>
> HTH,
> Ting-Yuan
> ______________________________________
> Ting-Yuan Liu
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> Seattle, WA, USA
> ______________________________________
>
> >
> > Thank you very much for your response!
> >
> > Hua
> >
> > -----Original Message-----
> > From: Ting-Yuan Liu [mailto:tliu at fhcrc.org]
> > Sent: Tuesday, March 28, 2006 3:21 PM
> > To: Hua Weng
> > Cc: bioconductor at stat.math.ethz.ch
> > Subject: RE: Error message from AnnBuilder
> >
> >
> > Hi Hua,
> >
> > The "subscript out of bounds" bug has been fixed in the development
> > version of AnnBuilder, I believe. Please give it a try.
> >
> > HTH,
> > Ting-Yuan
> >
> > On Tue, 28 Mar 2006, Hua Weng wrote:
> >
> > >
> > > Dear List and Ting-Yuan:
> > > I finally decided to use AnnBuilder on a Linux server. I got the sample
> > > data set, thgu95a, to work and successfully installed the annotation
> > > package. But when I tried to use a data set for cow (Bos taurus), I got
> > > the following error message:
> > >
> > > Error in all(is.na(annotation[, "GO"])) : subscript out of bounds
> > >
> > > I don't know what this error means. Is it because my data set cannot get
> > > any GO term?
> > >
> > > The code is as follows:
> > > > library("AnnBuilder")
> > > > myBase <- file.path("cluster6_Asitha_Bt.txt")
> > > > myDir <- "/home/hua/project/bioconductor/AnnBuilder/"
> > > > myBaseType="gbNRef"
> > > > mySrcUrls <- c(EG="file:////home/hua/project/bioconductor/AnnBuilder/gene_DATA",
> > > + UG="file:////home/hua/project/bioconductor/AnnBuilder/UniGene/Bos_taurus/Bt.data.gz",
> > > + GO="file:////home/hua/project/bioconductor/AnnBuilder/go_200603-termdb.rdf-xml.gz")
> > > > ABPkgBuilder(baseName = myBase, srcUrls = mySrcUrls, baseMapType = myBaseType,
> > > + pkgName = "AsithaBtPkg", pkgPath = myDir, organism = "Bos taurus", version = "1.1.0",
> > > + author = list(author = "Hua Weng", maintainer = "Hua Weng <hweng at biochem.okstate.edu>"),
> > > + fromWeb = FALSE)
> > >
> > > The data set looks like this:
> > > > myBase
> > > [1] "cluster6_Asitha_Bt.txt"
> > > > read.table(myBase, sep="\t", header=FALSE, as.is=TRUE)
> > > V1 V2
> > > 1 a2g09 NM_174062
> > > 2 g1o22 <NA>
> > > 3 a1d09 XM_879288
> > > 4 a1e10 NM_175825
> > > 5 g4n11 XM_873598
> > > 6 g1b02 <NA>
> > > 7 f7c16 <NA>
> > > 8 a1h04 XM_580317
> > > 9 f5l19 BC102351
> > > 10 g4p13 XM_879908
> > > 11 g4k22 NM_173968
> > > 12 f6d15 XM_874804
> > > 13 g4l22 XM_615696
> > > 14 g1h03 XM_873394
> > > 15 a1d10 NM_174658
> > > 16 f6c14 <NA>
> > > 17 g4k13 NM_001034575
> > > 18 f7k05 XM_868174
> > > 19 g4k23 <NA>
> > > 20 f6k09 NM_001007815
> > > 21 f6d16 NM_174792
> > > 22 g4f07 <NA>
> > > 23 f5k24 BT021073
> > >
> > > The first column is the probe ID and the second column is the GenBank
> > > accession ID for Bos taurus. If I want to get the annotation for Mus
> > > musculus, can I still use the GenBank accession IDs for Bos taurus?
> > >
> > > > sessionInfo()
> > > R version 2.2.0, 2005-10-06, i686-pc-linux-gnu
> > >
> > > attached base packages:
> > > [1] "tools" "methods" "stats" "graphics" "grDevices" "utils"
> > > [7] "datasets" "base"
> > >
> > > other attached packages:
> > > GO AnnBuilder annotate XML Biobase
> > > "1.10.0" "1.8.0" "1.8.0" "0.99-6" "1.8.0"
> > >
> > > I highly appreciate any comments and suggestions.
> > >
> > > Thanks,
> > > Hua
> > >
> > >
> > > -----Original Message-----
> > > From: Ting-Yuan Liu [mailto:tliu at fhcrc.org]
> > > Sent: Friday, March 24, 2006 11:48 AM
> > > To: Hua Weng
> > > Cc: bioconductor at stat.math.ethz.ch
> > > Subject: Re: AnnBuilder
> > >
> > >
> > > Hi Hua,
> > >
> > > Yes, you can run AnnBuilder on Windows. That is not what I
> > > usually do, but I tried it and succeeded. However, the R on my Windows
> > > machine is "built from source" (see section 3.1 of the manual "R
> > > Installation and Administration"), so it might be a little different
> > > from your R (which was installed from the binary installer, I guess).
> > > Someone reported to me that they were unable to run AnnBuilder on
> > > Windows, but it did work on my machine. Therefore, first try to see
> > > whether you can build annotation packages with the binary-installed R.
> > > If not, you should switch to a source-built R.
> > >
> > > HTH,
> > > Ting-Yuan
> > >
> > > On Wed, 22 Mar 2006, Hua Weng wrote:
> > >
> > > > Hi, Bioconductor list and Ting-Yuan:
> > > >
> > > > I have problems using the AnnBuilder package.
> > > >
> > > > 1) May I use a Windows-based R environment to run ABPkgBuilder? I
> > > > haven't been able to run this command successfully. I saw that there is
> > > > a condition before this command, "if(.Platform$OS != "windows" &&
> > > > interactive())". Does this mean the command cannot run on the Windows
> > > > platform?
> > > >
> > > > 2) I also tried to install AnnBuilder in R 2.2.0 on a Linux server, but
> > > > I haven't been able to install it. The problem is that before I could
> > > > install the XML package, it gave me the error message "**** You should
> > > > use a recent version of libxml2, i.e. 2.6.22 or higher ****". And when
> > > > I tried to install 'libxml2', I got the following error:
> > > >
> > > > > install.packages("libxml2")
> > > > Warning in download.packages(pkgs, destdir = tmpd, available = available, :
> > > > no package 'libxml2' at the repositories
> > > >
> > > > So I want to ask how I can successfully install 'libxml2' on a Linux
> > > > server?
> > > >
> > > > > sessionInfo()
> > > > R version 2.2.0, 2005-10-06, i686-pc-linux-gnu
> > > >
> > > > attached base packages:
> > > > [1] "methods" "stats" "graphics" "grDevices" "utils" "datasets"
> > > > [7] "base"
> > > >
> > > > 3) I found that the UniGene source URL always points to 'Homo sapiens'
> > > > data, even for organisms other than 'Homo sapiens'. Is that true?
> > > >
> > > > Thanks for your attention!
> > > >
> > > > Hua Weng
> > > > Microarray Core Facility
> > > > Oklahoma State University
> > > > Department of Biochemistry and Molecular Biology
> > > > 246 Noble Research Center
> > > > Stillwater, OK 74078
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>