[R] ftp fetch using RCurl?

Duncan Temple Lang duncan at wald.ucdavis.edu
Fri Feb 27 04:54:58 CET 2009


Hi Stanley.

CHD850 wrote:
> Hi everyone,
> 
> I have to fetch about 300 to 500 zipped archives from a remote ftp server.
> Each archive is about 1 MB. I know I can get it done by using
> download.file() in R, but I am curious whether there is a faster way to do
> this using RCurl. For example, are there some parameters that I can set so
> that the connection does not need to be rebuilt, etc.?

Yes, curl can keep connections alive. One can create a curl handle with

  h = getCurlHandle()

and then use this in subsequent, related calls, e.g.

   getURLContent("ftp://....", curl = h)
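
For a batch of archives, reusing one handle might look roughly like the sketch below. The server, directory, and file names are hypothetical placeholders, fetch_all() is a helper name made up for illustration, and getBinaryURL() is used in place of getURLContent() simply because it returns a plain raw vector. Whether the connection is really reused depends on libcurl and on the server.

```r
library(RCurl)

# Hypothetical server location and file names; substitute the real ones.
base_url <- "ftp://ftp.example.org/pub/data/"
files    <- c("a.zip", "b.zip", "c.zip")

# Download every URL through one shared handle so that libcurl can reuse
# the connection where the server allows it. Returns the local paths.
fetch_all <- function(urls, dest_dir, handle = getCurlHandle()) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(dest_dir, basename(urls))
  for (i in seq_along(urls)) {
    bytes <- getBinaryURL(urls[i], curl = handle)  # raw vector
    writeBin(bytes, dest[i])
  }
  invisible(dest)
}

# Hypothetical usage:
# fetch_all(paste0(base_url, files), "/tmp/archives")
```

Timing this against a loop of download.file() calls on a handful of files would show whether the shared handle helps for a given server.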

Keeping the connection alive is more common in HTTP and can be done
explicitly by specifying

    Connection = "Keep-Alive"

as one of the values for httpheader. But this is for HTTP. For FTP,
I'd have to look up the relevant curl options.
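
For the HTTP case, passing that header could be sketched as follows. The URL is a hypothetical placeholder and get_keep_alive() is a name made up for this example; nothing here applies to FTP.

```r
library(RCurl)

# Request a page over HTTP while asking the server to keep the connection
# open, going through the supplied handle so later calls can share it.
get_keep_alive <- function(url, handle) {
  getURL(url, httpheader = c(Connection = "Keep-Alive"), curl = handle)
}

# Hypothetical usage:
# h <- getCurlHandle()
# page1 <- get_keep_alive("http://www.example.org/a.html", h)
# page2 <- get_keep_alive("http://www.example.org/b.html", h)
```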

In addition to using a single handle across multiple calls, one
can use the multi-curl interface within RCurl which allows one
to make many asynchronous requests and process them as they reply.
This can often be faster than the same number of requests done
sequentially.
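
A minimal sketch of the asynchronous route, using getURIAsynchronous() on text resources. The URLs are hypothetical and fetch_text_async() is a made-up wrapper. With the default arguments the call is expected to return once all transfers have finished. How binary content such as zip archives comes back from this function is something to check in the RCurl documentation (see its binary argument) before relying on it.

```r
library(RCurl)

# Fetch several text resources concurrently through RCurl's multi interface
# and check that one result came back per URL.
fetch_text_async <- function(urls) {
  res <- getURIAsynchronous(urls)
  stopifnot(length(res) == length(urls))
  res
}

# Hypothetical usage:
# urls  <- paste0("ftp://ftp.example.org/pub/data/", c("a.txt", "b.txt"))
# pages <- fetch_text_async(urls)
```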

> 
> An even simpler question is, how can I fetch an archive from the server and
> place it somewhere locally? I have spent a lot of time reading RCurl
> documents and curl web pages but in vain. Can someone show me an example of
> the syntax? Pardon me if this is trivial to you.

I would use something like

   content = getURLContent("ftp://...../foo.zip")

   attributes(content) = NULL

   writeBin(content, "/tmp/foo.zip")

and that should be sufficient.

(You have to strip the attributes or writeBin() complains.)
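
The same steps can be wrapped into a pair of small helpers, sketched here. The names save_raw() and fetch_to_file() are made up for illustration, and the is.raw() check is an added assumption that the server's reply comes back as binary content.

```r
library(RCurl)

# Strip any attributes from the downloaded bytes and write them to 'path'.
# Returns the path invisibly.
save_raw <- function(content, path) {
  stopifnot(is.raw(content))
  attributes(content) <- NULL   # writeBin() rejects vectors with attributes
  writeBin(content, path)
  invisible(path)
}

# Fetch one URL and store it locally.
fetch_to_file <- function(url, path, handle = getCurlHandle()) {
  save_raw(getURLContent(url, curl = handle), path)
}

# Hypothetical usage:
# fetch_to_file("ftp://ftp.example.org/pub/data/foo.zip", "/tmp/foo.zip")
```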


> 
> Thanks
> Stanley
>



