[Bioc-devel] biomaRt cannot list marts when going through a mirror web site
Nicolas Delhomme
delhomme at embl.de
Wed May 2 10:52:19 CEST 2012
Hi Steffen, hi Wolfgang,
When trying to list the marts available from an Ensembl mirror, I get the following:
listMarts(host="uswest.ensembl.org")
Space required after the Public Identifier
SystemLiteral " or ' expected
SYSTEM or PUBLIC, the URI is missing
Error: 1: Space required after the Public Identifier
2: SystemLiteral " or ' expected
3: SYSTEM or PUBLIC, the URI is missing
This is triggered by this line:
registry = bmRequest(request = request, ssl.verifypeer = ssl.verifypeer, verbose = verbose)
in the listMarts function.
Looking at the bmRequest function, it uses the getURL function of the RCurl package. This function is the culprit:
## the request as computed by listMarts
request = "http://uswest.ensembl.org:80/biomart/martservice?type=registry&requestid=biomaRt"
getURL(request, ssl.verifypeer = TRUE)
[1] "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>302 Found</title>\n</head><body>\n<h1>Found</h1>\n<p>The document has moved <a href=\"http://www.ensembl.org/biomart/martservice?type=registry&requestid=biomaRt;redirect=mirror;source=uswest.ensembl.org\">here</a>.</p>\n</body></html>\n"
As you can see, it returns a 302 redirect page, i.e. the mirror redirects to "www.ensembl.org" in my case.
Adding a followlocation=TRUE argument to that command solves the problem:
getURL(request, ssl.verifypeer = TRUE, followlocation=TRUE)
[1] "\n<MartRegistry>\n <MartURLLocation database=\"ensembl_mart_66\" default=\"1\" displayName=\"Ensembl Genes 66\" host=\"www.ensembl.org\" includeDatasets=\"\" martUser=\"\" name=\"ENSEMBL_MART_ENSEMBL\" path=\"/biomart/martservice\" port=\"80\" serverVirtualSchema=\"default\" visible=\"1\" />\n <MartURLLocation database=\"sequence_mart_66\" default=\"\" displayName=\"Sequence\" host=\"www.ensembl.org\" includeDatasets=\"\" martUser=\"\" name=\"ENSEMBL_MART_SEQUENCE\" path=\"/biomart/martservice\" port=\"80\" serverVirtualSchema=\"default\" visible=\"\" />\n <MartURLLocation database=\"ontology_mart_66\"
... truncated
Can you please add that additional parameter to the getURL call?
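For illustration, here is a minimal sketch of what I have in mind. It is not the actual bmRequest code: the function names fetchRegistry and looksLikeRegistry are made up for this example, and the sanity check on the response is an extra suggestion on top of the followlocation change. It assumes the RCurl package is installed.

```r
## Requires the RCurl package (called via RCurl::getURL below).

## Hypothetical helper: TRUE if the response body looks like a mart registry
## rather than an HTML page such as the "302 Found" notice shown above.
looksLikeRegistry <- function(response) {
  grepl("<MartRegistry>", response, fixed = TRUE)
}

## Hypothetical variant of the request step; NOT the real bmRequest body.
fetchRegistry <- function(request, ssl.verifypeer = TRUE) {
  response <- RCurl::getURL(request,
                            ssl.verifypeer = ssl.verifypeer,
                            followlocation = TRUE)  # follow 302 redirects
  ## Fail early with a readable message instead of an XML parsing error.
  if (!looksLikeRegistry(response)) {
    stop("Unexpected response from ", request,
         "; it does not look like a mart registry (redirect or error page?)")
  }
  response
}
```

With something like this, fetchRegistry(request) should return the registry XML for the request shown above, and a host that answers with an HTML page would give an explicit error rather than the "Space required after the Public Identifier" messages.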
Steffen, if you're in a location that uses "uswest.ensembl.org" as a mirror (i.e. US west coast I guess, Seattle for sure :-)), you can use "www.ensembl.org" as a host to reproduce that error instead.
I have only stumbled onto this one, but there might be more occurrences in the code that need adapting. I'll try to figure that out.
My session info (R 2.15.0 with useDevel(TRUE)):
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] customCDF_0.99.3 XML_3.9-4 RSQLite_0.11.1
[4] RCurl_1.91-1 bitops_1.0-4.1 Rsamtools_1.9.8
[7] GEOquery_2.23.1 GenomicFeatures_1.9.6 GenomicRanges_1.9.9
[10] DBI_0.2-5 Biostrings_2.25.3 IRanges_1.15.7
[13] AnnotationDbi_1.19.4 vsn_3.25.0 makecdfenv_1.35.0
[16] gcrma_2.29.0 BiocInstaller_1.5.7 biomaRt_2.13.0
[19] affy_1.35.1 Biobase_2.17.5 BiocGenerics_0.3.0
loaded via a namespace (and not attached):
[1] affyio_1.25.0 BSgenome_1.25.1 grid_2.15.0
[4] lattice_0.20-6 limma_3.13.1 preprocessCore_1.19.0
[7] rtracklayer_1.17.0 splines_2.15.0 stats4_2.15.0
[10] tools_2.15.0 zlibbioc_1.3.0
Cheers,
Nico
---------------------------------------------------------------
Nicolas Delhomme
Genome Biology Computational Support
European Molecular Biology Laboratory
Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany