[BioC] GEOquery not working, NCBI changes?

Gad Abraham gabraham at csse.unimelb.edu.au
Wed Mar 3 12:32:14 CET 2010


I've recently been having trouble getting GEOquery to work.

On Ubuntu 9.10/R-2.10.1/BioC 2.6.1, GEOquery 2.11.3, download.file
stops with "cannot open URL".
On Ubuntu 8.04.1/R-2.9.2/BioC 2.4.1, GEOquery 2.8.0, curlPerform stops
with "Server denied you to change to the given directory".

wget downloads the file fine, but requires PASV, which download.file()
doesn't support.
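
Since wget itself succeeds, one possible workaround is to let R hand the
transfer to the external wget binary through download.file()'s
method = "wget" option. This is only a minimal sketch: it assumes wget is
on the PATH, the helper names series_matrix_url() and fetch_with_wget()
are hypothetical (they are not part of GEOquery), and it has not been
checked against the NCBI server.

```r
# Build a series-matrix URL in the same form as the one GEOquery requests.
series_matrix_url <- function(gse, file) {
  sprintf("ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/%s/%s", gse, file)
}

url <- series_matrix_url("GSE2034", "GSE2034_series_matrix-1.txt.gz")

# Hypothetical helper: delegate the download to the external wget program
# instead of R's internal FTP client.
fetch_with_wget <- function(url, destfile = basename(url)) {
  status <- download.file(url, destfile = destfile, method = "wget", mode = "wb")
  # download.file() returns 0 on success and a non-zero code otherwise.
  if (status != 0) stop("download failed with status ", status)
  destfile
}

# Not run here (needs network access and wget):
# path <- fetch_with_wget(url)
```

This only fetches the file; it does not change how getGEO() itself
downloads data.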

I suspect that NCBI has changed some server settings. Is this a known issue?


> library(GEOquery)
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'openVignette()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation(pkgname)'.

Loading required package: RCurl
Loading required package: bitops
> g <- getGEO("GSE2034")
Found 2 file(s)
trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE2034/GSE2034_series_matrix-1.txt.gz'
Error in download.file(sprintf("ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/%s/%s",
  cannot open URL
> sessionInfo()
R version 2.10.1 (2009-12-14)

 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_AU.UTF-8
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GEOquery_2.11.3 RCurl_1.3-1     bitops_1.0-4.1  Biobase_2.6.1

$ wget ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE2034/GSE2034_series_matrix-1.txt.gz
--2010-03-03 22:08:59--
           => `GSE2034_series_matrix-1.txt.gz'
Resolving ftp.ncbi.nih.gov...
Connecting to ftp.ncbi.nih.gov||:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/geo/DATA/SeriesMatrix/GSE2034 ... done.
==> SIZE GSE2034_series_matrix-1.txt.gz ... 12800217
==> PASV ... done.    ==> RETR GSE2034_series_matrix-1.txt.gz ... done.
Length: 12800217 (12M)

Gad Abraham
PhD Student, Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham
