[R] Problems with getURL (RCurl) to obtain list files of an ftp directory
Duncan Temple Lang
duncan at wald.ucdavis.edu
Fri Oct 12 18:41:39 CEST 2012
Hi Francisco
The code gives me the correct results, and it works for you on a Windows machine.
So while the cause could be different versions of software (e.g. libcurl, RCurl),
the presence of the word "squid" in the HTML suggests to me that
your machine/network is using the proxy/caching software Squid. Squid intercepts
requests, caches the results locally, and shares them across
local users. So if Squid has already retrieved that page for an HTML client (e.g. a browser, or
a request with Content-Type set to text/html), it may be serving that cached copy for your FTP request.
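A rough sketch of how one might check for this from within R. The helper name
looks_like_squid_html is made up for illustration (it is not part of RCurl), and the
environment variables listed are the ones libcurl commonly consults, not something
I know about your particular setup.

```r
# Proxy-related environment variables that libcurl commonly consults.
# An empty value means the variable is not set in this R session.
proxy_vars <- c("ftp_proxy", "FTP_PROXY", "http_proxy", "all_proxy", "ALL_PROXY")
print(Sys.getenv(proxy_vars))

# Hypothetical helper (not part of RCurl): TRUE when a response looks like
# a Squid-generated HTML page rather than a plain FTP name list.
looks_like_squid_html <- function(txt) {
  grepl("<html", txt, ignore.case = TRUE) &&
    grepl("squid", txt, ignore.case = TRUE)
}
```

If one of the variables points at something like localhost:3128, or the helper returns
TRUE for the text that getURL gave back, that would be consistent with a proxy
answering in place of the FTP server.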
One thing I like to do when debugging RCurl calls is to add
verbose = TRUE
to the .opts argument and then see the information about the communication.
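For example, something along these lines (a minimal sketch, not tested against your
network). The URL is the one from your message. The names opts, opts_noproxy and
split_listing are just illustrative. The proxy = "" setting is an assumption on my
part: libcurl documents an empty proxy string as "do not use a proxy", but it will
not help if the proxy intercepts traffic transparently.

```r
# URL from the original post.
MOD16A2.doy <- "ftp://ftp.ntsg.umt.edu/pub/MODIS/Mirror/MOD16/MOD16A2.105_MERRAGMAO/"

# Options as a plain list; verbose = TRUE makes libcurl report the
# connection and protocol exchange (normally on stderr).
opts <- list(ftplistonly = TRUE, verbose = TRUE)

# Assumption: an empty proxy string asks libcurl to ignore any proxy
# configured through environment variables.
opts_noproxy <- c(opts, list(proxy = ""))

# Illustrative helper: split a listing on LF or CRLF line endings.
split_listing <- function(txt) strsplit(txt, "\r?\n")[[1]]

# Only contact the server in an interactive session with RCurl installed.
if (interactive() && requireNamespace("RCurl", quietly = TRUE)) {
  items <- split_listing(RCurl::getURL(MOD16A2.doy, .opts = opts))
  print(items)
}
```

In the verbose output, the host and port that libcurl connects to should show whether
the request goes to ftp.ntsg.umt.edu directly or to a local proxy. Swapping opts for
opts_noproxy in the getURL call is one way to compare the two cases.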
D.
On 10/11/12 11:37 AM, Francisco Zambrano wrote:
> Dear all,
>
> I have a problem with the 'getURL' command from the RCurl package, which I
> have been using to obtain an FTP directory listing of the MOD16 (ET, DSI)
> products and then to download them (part of a script by Tomislav
> Hengl, spatial-analyst). Instead of the list of files from the FTP server, I am
> getting complete HTML code. Does anyone know why this might happen?
>
> These are the steps I have been following:
>
>> MOD16A2.doy <- 'ftp://ftp.ntsg.umt.edu/pub/MODIS/Mirror/MOD16/MOD16A2.105_MERRAGMAO/'
>
>> items <- strsplit(getURL(MOD16A2.doy, .opts = curlOptions(ftplistonly = TRUE)), "\n")[[1]]
>
>> items  # results
>
> [1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"
> http://www.w3.org/TR/html4/loose.dtd\">\n<!-- HTML listing generated by
> Squid 2.7.STABLE9 -->\n<!-- Wed, 10 Oct 2012 13:43:53 GMT
> -->\n<HTML><HEAD><TITLE>\nFTP Directory:
> ftp://ftp.ntsg.umt.edu/pub/MODIS/Mirror/MOD16/MOD16A2.105_MERRAGMAO/\n</TITLE>\n<STYLE
> type=\"text/css\"><!--BODY{background-color:#ffffff;font-family:verdana,sans-serif}--></STYLE>\n</HEAD><BODY>\n<H2>\nFTP
> Directory: <A HREF=\"/\">ftp://ftp.ntsg.umt.edu</A>/<A
> HREF=\"/pub/\">pub</A>/<A HREF=\"/pub/MODIS/\">MODIS</A>/<A
> HREF=\"/pub/MODIS/Mirror/\">Mirror</A>/<A
> HREF=\"/pub/MODIS/Mirror/MOD16/\">MOD16</A>/<A
> HREF=\"/pub/MODIS/Mirror/MOD16/MOD16A2.105_MERRAGMAO/\">MOD16A2.105_MERRAGMAO</A>/</H2>\n<PRE>\n<A
> HREF=\"../\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dirup.gif\"
> ALT=\"[DIRUP]\"></A> <A HREF=\"../\">Parent Directory</A> \n<A
> HREF=\"GEOTIFF_0.05degree/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"GEOTIFF_0.05degree/\">GEOTIFF_0.05degree</A>
> . . . . . . . Jun 3 18:00 \n<A HREF=\"GEOTIFF_0.5degree/\"><IMG
> border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"GEOTIFF_0.5degree/\">GEOTIFF_0.5degree</A>. .
> . . . . . . Jun 3 18:01 \n<A HREF=\"Y2000/\"><IMG border=\"0\"
> SRC=\"http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2000/\">Y2000</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2001/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2001/\">Y2001</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2002/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2002/\">Y2002</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2003/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2003/\">Y2003</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2004/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2004/\">Y2004</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2005/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2005/\">Y2005</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2006/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2006/\">Y2006</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2007/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2007/\">Y2007</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2008/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2008/\">Y2008</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2009/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2009/\">Y2009</A>. . . . . . . . . . . . . .
> Dec 23 2010 \n<A HREF=\"Y2010/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2010/\">Y2010</A>. . . . . . . . . . . . . .
> Feb 20 2011 \n<A HREF=\"Y2011/\"><IMG border=\"0\" SRC=\"
> http://localhost:3128/squid-internal-static/icons/anthony-dir.gif\"
> ALT=\"[DIR] \"></A> <A HREF=\"Y2011/\">Y2011</A>. . . . . . . . . . . . . .
> Mar 12 2012 \n</PRE>\n<HR noshade
> size=\"1px\">\n<ADDRESS>\nGenerated Wed, 10 Oct 2012 13:43:53 GMT by
> localhost (squid/2.7.STABLE9)\n</ADDRESS></BODY></HTML>\n"
>
> The curious thing is that getURL was working well before, and I don't know
> what changed. The same command works fine on Windows.
>
> sessionInfo() gives the following:
>
> R version 2.14.1 (2011-12-22)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
> [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C                 LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C            LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] MODIS_0.5-8     maptools_0.8-16 lattice_0.20-0  foreign_0.8-48  date_1.2-32
> [6] RCurl_1.95-0.1  bitops_1.0-4.1  rgdal_0.7-19    raster_2.0-12   sp_0.9-99
>
> loaded via a namespace (and not attached):
> [1] grid_2.14.1 tools_2.14.1
>
> Kind regards to all,
>
> Francisco Zambrano Bigiarini
> INIA Quilamapu, Chillán, *Chile*