[BioC] AnnBuilder and Kegg
John Zhang
jzhang at jimmy.harvard.edu
Wed Nov 22 15:29:18 CET 2006
>After the program finishes I eventually have an annotation package for
>my data but it does not contain any kegg data.
Look at your code. The organism name is wrong (Mus Musclusus rather than Mus
musculus).
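
A quick check along these lines would catch such a typo before a long build
starts. This is only a sketch: knownOrganisms and checkOrganism are
hypothetical names made up for illustration, and the list of accepted names is
an assumption, not something read out of AnnBuilder.

```r
# Illustrative list of organism names; an assumption, not taken from AnnBuilder.
knownOrganisms <- c("Homo sapiens", "Mus musculus", "Rattus norvegicus")

# Hypothetical helper: stop early when the name is not in the list and
# suggest the closest known name by edit distance (utils::adist).
checkOrganism <- function(org, known = knownOrganisms) {
    if (org %in% known) {
        return(invisible(org))
    }
    closest <- known[which.min(adist(org, known))]
    stop("Unknown organism '", org, "'. Did you mean '", closest, "'?")
}

# A correct name passes silently.
checkOrganism("Mus musculus")

# The misspelled name from the script is caught; keep the message for display.
res <- tryCatch(checkOrganism("Mus Musclusus"),
                error = function(e) conditionMessage(e))
print(res)
```

Calling the check on the value you intend to pass as the organism argument,
right before the build call, keeps the failure cheap.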
>
>When I install my package (outside R, under linux)
>I have this:
>
>*******************************************************************************
>R CMD INSTALL lgtc201106
>* Installing *source* package 'lgtc201106' ...
>** R
>** data
>** moving datasets to lazyload DB
>** help
> >>> Building/Updating help pages for package 'lgtc201106'
> Formats: text html latex example
> lgtc201106 text html latex
> lgtc201106ACCNUM text html latex example
> lgtc201106CHR text html latex example
> lgtc201106ENZYME text html latex example
> lgtc201106GENENAME text html latex example
> lgtc201106GO text html latex example
> lgtc201106GO2ALLPROBES text html latex example
> lgtc201106GO2PROBE text html latex example
> lgtc201106LOCUSID text html latex example
> lgtc201106MAP text html latex example
> lgtc201106OMIM text html latex example
> lgtc201106ORGANISM text html latex example
> lgtc201106PATH text html latex example
> lgtc201106PMID text html latex example
> lgtc201106PMID2PROBE text html latex example
> lgtc201106QC text html latex
> lgtc201106QCDATA text html latex
> lgtc201106REFSEQ text html latex example
> lgtc201106SUMFUNC text html latex example
> lgtc201106SYMBOL text html latex example
> lgtc201106UNIGENE text html latex example
>** building package indices ...
>* DONE (lgtc201106)
>*******************************************************************************
>
>and when I call the library in R
>*******************************************************************************
>library(lgtc201106)
>lgtc201106()
>
>
>Quality control information for lgtc201106
>Date built: Created: Wed Nov 22 13:12:38 2006
>
>Number of probes: 23233
>Probe number missmatch: None
>Probe missmatch: None
>Mappings found for probe based rda files:
> lgtc201106ACCNUM found 22512 of 23233
> lgtc201106CHR found 18757 of 23233
> lgtc201106ENZYME found 0 of 23233
> lgtc201106GENENAME found 18674 of 23233
> lgtc201106GO found 0 of 23233
> lgtc201106GO found 0 of 23233
> lgtc201106LOCUSID found 18977 of 23233
> lgtc201106MAP found 15808 of 23233
> lgtc201106OMIM found 433 of 23233
> lgtc201106PATH found 0 of 23233
> lgtc201106PMID found 18967 of 23233
> lgtc201106REFSEQ found 14098 of 23233
> lgtc201106SUMFUNC found 0 of 23233
> lgtc201106SYMBOL found 18977 of 23233
> lgtc201106UNIGENE found 18149 of 23233
>Mappings found for non-probe based rda files:
> lgtc201106GO2ALLPROBES found 6994
> lgtc201106GO2PROBE found 5360
> lgtc201106ORGANISM found 1
> lgtc201106PMID2PROBE found 92300
>
>kegg <- as.list(lgtc201106PATH2PROBE)
>Error: object "lgtc201106PATH2PROBE" not found
>Error in as.list(lgtc201106PATH2PROBE) : unable to find the argument 'x' in
>selecting a method for function 'as.list'
>
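
The QC listing above reports lgtc201106PATH with 0 of 23233 probes mapped, and
there is no PATH2PROBE entry among the non-probe based files, which would
explain the "not found" error. Below is a minimal sketch of how one might
check both things before calling as.list(). The toy environment stands in for
a package data object, the probe and pathway IDs are made up, and countMapped
and getIfExists are hypothetical helpers, not AnnBuilder functions.

```r
# Toy stand-in for a data environment such as lgtc201106PATH (made-up content).
toyPATH <- new.env()
assign("probe1", c("00010", "00020"), envir = toyPATH)
assign("probe2", NA, envir = toyPATH)
assign("probe3", NA, envir = toyPATH)

# Hypothetical helper: count keys that have at least one non-NA mapping.
countMapped <- function(env) {
    values <- mget(ls(env), envir = env)
    sum(vapply(values, function(v) !all(is.na(v)), logical(1)))
}

# Hypothetical helper: return the object if it exists, NULL otherwise,
# instead of failing with "object not found".
getIfExists <- function(name, envir = globalenv()) {
    if (exists(name, envir = envir)) get(name, envir = envir) else NULL
}

print(countMapped(toyPATH))
print(is.null(getIfExists("noSuchObject", envir = toyPATH)))
```

With a real package attached, the same two calls could be pointed at the
package's own objects by name.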
>
>*******************************************************************************
>
>thanks
>
>P
>
>
>
>-----Original Message-----
>From: John Zhang [mailto:jzhang at jimmy.harvard.edu]
>Sent: Wed 11/22/2006 2:25 PM
>To: jzhang at jimmy.harvard.edu; Pedotti, P. (HKG)
>Cc: bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] AnnBuilder and Kegg
>
>
>>Thank you for the suggestions.
>>However, I downloaded the newest version of AnnBuilder
>>and still had the same problem with the KEGG connection.
>
>Have you looked at the built package to see if you get any pathway annotation?
>Warning messages like:
>
>Failed to get data from URL:
>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00010.gene
>
>just tell you that there are name mismatches in KEGG's data files, but the data
>package should still build.
>
>I will try to write more informative warning messages when I get the chance.
>
>
>>
>>******************************************************************************
>>
>>sessionInfo()
>>Version 2.3.1 (2006-06-01)
>>i386-pc-linux-gnu
>>
>>attached base packages:
>>[1] "tools" "methods" "stats" "graphics" "grDevices" "utils"
>>[7] "datasets" "base"
>>
>>other attached packages:
>>       GO AnnBuilder  RSQLite      DBI annotate      XML  Biobase
>> "1.12.0"   "1.12.0"  "0.4-1" "0.1-10" "1.10.0" "0.99-7" "1.10.0"
>>
>>mySrcUrls <- c(
>>    GO = "http://www.godatabase.org/dev/database/archive/latest/go_200605-termdb.rdf-xml.gz",
>>    KEGG = "ftp://ftp.genome.ad.jp/pub/kegg/pathways",
>>    YG = "ftp://genome-ftp.stanford.edu/pub/yeast/data_download/",
>>    HG = "ftp://ftp.ncbi.nih.gov/pub/HomoloGene/old/hmlg.ftp",
>>    EG = "ftp://ftp.ncbi.nlm.nih.gov/gene/DATA",
>>    IPI = "ftp://ftp.ebi.ac.uk/pub/databases/IPI/current/",
>>    YEAST = "ftp://ftp.yeastgenome.org/pub/yeast/sequence_similarity/domains/",
>>    KEGGGENOME = "ftp://ftp.genome.ad.jp/pub/kegg/tarfiles/genome",
>>    PFAM = "ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/Pfam-A.full.gz"
>>)
>>ppbase<- file.path(.path.package("AnnBuilder"), "data",
>>"lgtc.ids.1.txt")
>>myBaseType="gb"
>>ABPkgBuilder(baseName=ppbase,
>>+ srcUrls = mySrcUrls,
>>+ baseMapType = myBaseType,
>>+ pkgName = "lgtc.221106",
>>+ pkgPath = '.',
>>+ organism ="mouse",
>>+ version ="1.1.0",
>>+ author = list(author = "Paola Pedotti",
>>+ maintainer ="Paola Pedotti <p.pedotti at lumc.nl>")
>>+ )
>>
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00010.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00020.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00030.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00031.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00040.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00051.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00052.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00053.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00061.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00062.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00071.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00072.gene
>>Failed to get data from URL:
>>ftp://ftp.genome.ad.jp/pub/kegg/pathways//00100.gene
>>......................
>>
>>
>>******************************************************************************
>>
>>
>>Do you have other suggestions?
>>
>>thanks
>>
>>Paola
>>
>>
>>On Tue, 2006-11-21 at 12:13 -0500, John Zhang wrote:
>>> >
>>> >Hi everybody,
>>> >I am trying to annotate my dataset (home spotted array, two colors,
>>> >mice) using AnnBuilder.
>>> >Every time I run the program the connection to the KEGG
>>> >website fails, so I am able to build the annotation
>>> >package but without the KEGG pathways. Does anybody know how to
>>> >fix this problem, or has anybody found a way to bypass it (like
>>> >downloading a list of accession numbers and corresponding pathways)?
>>> >Here is my script:
>>>
>>> I guess the best thing for you to do is to update your R and BioC packages.
>>> The released version of AnnBuilder is 1.12.0 while you have 1.10.0 on your
>>> machine.
>>>
>>>
>>>
>>> >
>>>
>>> >******************************************************************************
>>> >
>>> >library(AnnBuilder)
>>> >#Loading required package: Biobase
>>> >#Loading required package: tools
>>> >#Welcome to Bioconductor
>>> ># Vignettes contain introductory material. To view,
>>> ># simply type: openVignette()
>>> ># For details on reading vignettes, see
>>> ># the openVignette help page.
>>> >#Loading required package: annotate
>>> >
>>> >library(GO)
>>> >
>>> >sessionInfo()
>>> >
>>> >#Version 2.3.1 (2006-06-01)
>>> >#i386-pc-linux-gnu
>>> >#
>>> >#attached base packages:
>>> >#[1] "splines" "tools" "methods" "stats" "graphics"
>>> >#"grDevices"
>>> >#[7] "utils" "datasets" "base"
>>> >#
>>> >#other attached packages:
>>> >#
>>> ># globaltest vsn limma multtest
>>> ># "4.2.0" "1.10.0" "2.7.3" "1.10.2"
>>> ># survival affydata affy affyio
>>> ># "2.20" "1.8.0" "1.10.0" "1.0.0"
>>> ># KEGG GO AnnBuilder RSQLite
>>> ># "1.12.0" "1.12.0" "1.10.0" "0.4-1"
>>> ># DBI annotate XML Biobase
>>> ># "0.1-10" "1.10.0" "0.99-7" "1.10.0"
>>> >
>>> >
>>> >mySrcUrls <- getSrcUrl("all", organism="Mus Musclusus")
>>> >
>>> >base<- file.path(.path.package("AnnBuilder"), "data", "lgtc.ids.1.txt")
>>> >
>>> >myBaseType<- "gbNRef"
>>> >ABPkgBuilder(baseName=base,
>>> > srcUrls = mySrcUrls,
>>> > baseMapType = myBaseType,
>>> > pkgName = "lgtc201106",
>>> > pkgPath = ".",
>>> > organism ="Mus Musclusus",
>>> > version ="1.1.0",
>>> > author = list(author = "Paola Pedotti",
>>> > maintainer ="Paola Pedotti <p.pedotti at lumc.nl>")
>>> > )
>>> >
>>> >
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07214.gene
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07215.gene
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07216.gene
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07217.gene
>>> >#Failed to get data from URL:
>>> >ftp://ftp.genome.ad.jp/pub/kegg/pathways//07218.gene
>>> >#[1] "0 2 2"
>>> >#Warning message:
>>> >#cannot open file
>>> >'/usr/local/lib/R/site-library/AnnBuilder/templates/PKGNAMEGO.1.Rd',
>>> >reason 'No such file or directory'
>>> >#The following data sets have been added to the database and will be
>>> >removed:
>>> ># [1] "./lgtc161106/data/lgtc161106ACCNUM.rda"
>>> ># [2] "./lgtc161106/data/lgtc161106CHR.rda"
>>> ># [3] "./lgtc161106/data/lgtc161106ENZYME.rda"
>>> ># [4] "./lgtc161106/data/lgtc161106GENENAME.rda"
>>> ># [5] "./lgtc161106/data/lgtc161106GO.1.rda"
>>> ># [6] "./lgtc161106/data/lgtc161106GO2ALLPROBES.rda"
>>> ># [7] "./lgtc161106/data/lgtc161106GO2PROBE.rda"
>>> ># [8] "./lgtc161106/data/lgtc161106GO.rda"
>>> ># [9] "./lgtc161106/data/lgtc161106LOCUSID.rda"
>>> >#[10] "./lgtc161106/data/lgtc161106MAPCOUNTS.rda"
>>> >#[11] "./lgtc161106/data/lgtc161106MAP.rda"
>>> >#[12] "./lgtc161106/data/lgtc161106OMIM.rda"
>>> >#[13] "./lgtc161106/data/lgtc161106ORGANISM.rda"
>>> >#[14] "./lgtc161106/data/lgtc161106PATH.rda"
>>> >#[15] "./lgtc161106/data/lgtc161106PMID2PROBE.rda"
>>> >#[16] "./lgtc161106/data/lgtc161106PMID.rda"
>>> >#[17] "./lgtc161106/data/lgtc161106QCDATA.rda"
>>> >#[18] "./lgtc161106/data/lgtc161106QC.rda"
>>> >#[19] "./lgtc161106/data/lgtc161106REFSEQ.rda"
>>> >#[20] "./lgtc161106/data/lgtc161106SUMFUNC.rda"
>>> >#[21] "./lgtc161106/data/lgtc161106SYMBOL.rda"
>>> >#[22] "./lgtc161106/data/lgtc161106UNIGENE.rda"
>>> >#Warning message:
>>> >#Can't copy /usr/local/lib/R/site-library/AnnBuilder/templates/PKGNAMEGO.1.Rd
>>> >in: copyTemplates(repList, pattern, pkgName, pkgPath)
>>> >
>>>
>>> >******************************************************************************
>>> >
>>> >
>>> >thank you in advance
>>> >
>>> >Paola
>>> >
>>> >
>>> >
>>> >_______________________________________
>>> >Center for Human and Clinical Genetics
>>> >Leiden University Medical Center
>>> >Postzone: S-04-P, Postbus 9600
>>> >2300 RC Leiden, The Netherlands
>>> >Telephone: +31 71 526 9440
>>> >Fax: +31 71 526 8285
>>> >
>>> >_______________________________________________
>>> >Bioconductor mailing list
>>> >Bioconductor at stat.math.ethz.ch
>>> >https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> >Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>> Jianhua Zhang
>>> Department of Medical Oncology
>>> Dana-Farber Cancer Institute
>>> 44 Binney Street
>>> Boston, MA 02115-6084
>>>
>>
>
>Jianhua Zhang
>Department of Medical Oncology
>Dana-Farber Cancer Institute
>44 Binney Street
>Boston, MA 02115-6084
>
Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084