[R] Behaviour of 'source' with URLs and proxy
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Oct 5 13:45:58 CEST 2011
On Wed, 5 Oct 2011, Renaud Gaujoux wrote:
> From the help page ?file I -- had -- read the following:
>
> "For ‘url’ the description is a complete URL, including scheme
> (such as ‘http://’, ‘ftp://’ or ‘file://’). Proxies can be
> specified for HTTP and FTP ‘url’ connections: see ‘download.file’."
So you should have known that it was the same as url()!
> From the internet.info messages it seems that the proxy is actually used, but
> somehow differently than what download.file does (via wget).
No, somewhat differently than *wget* does. As that help page says,
the section on proxies only refers to the internal method.
> Is source supposed to work through a proxy?
Yes, and it has been tested to do so. But not tested on your proxy ....
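
Since download.file with method 'wget' is reported to work in this setup while url() connections hang, one possible workaround is to fetch the script to a temporary file and source the local copy. The following is only a sketch: source_via_download is a hypothetical helper name, not part of R, and the method argument simply defaults to whatever download.file.method is set to.

```r
# Hypothetical helper (not part of base R): download a script with
# download.file(), then source() the local copy instead of opening a
# url() connection directly.
source_via_download <- function(url,
                                method = getOption("download.file.method", "auto"),
                                ...) {
  tmp <- tempfile(fileext = ".R")
  on.exit(unlink(tmp), add = TRUE)   # remove the temporary copy afterwards

  # download.file() returns 0 (invisibly) on success
  status <- download.file(url, destfile = tmp, method = method, quiet = TRUE)
  if (status != 0) stop("download failed for ", url)

  # Extra arguments (e.g. local=, echo=) are passed on to source()
  source(tmp, ...)
}
```

With options(download.file.method='wget') set, a call such as source_via_download("http://*OUR_HOST*/~renaud/R/setrepos.R") would then go through wget rather than the internal url() code.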
>
> --
> Renaud Gaujoux
> Computational Biology - University of Cape Town
> South Africa
>
>
> On 05/10/2011 12:26, Prof Brian Ripley wrote:
>> On Wed, 5 Oct 2011, Renaud Gaujoux wrote:
>>
>>> Hi,
>>>
>>> I am having trouble sourcing a file from our local network from R.
>>> It looks like such files are not properly accessed by 'source', even though
>>> they can be downloaded with download.file. (See below my settings and some
>>> tests I did). I ended up with a workaround, but I would like to
>>> understand what is going on.
>>>
>>> Don't source/readLines use the same mechanism as download.file to
>>> access URLs?
>>
>> No. They use url() connections. See ?file.
>>
>>>
>>> Thank you.
>>>
>>> Renaud
>>>
>>> My setting:
>>> - I am using R 2.13.2 on Ubuntu 11.04.
>>> - I am accessing internet through a proxy (set up with cntlm, not sure if
>>> this is the issue but I don't know how to check without it). This means
>>> that http_proxy='http://localhost:8080/'.
>>> - We have local CRAN/BioConductor mirrors that can be accessed without
>>> going through the proxy.
>>> - My .Rprofile sources a file 'setrepos.R' on the local network, that sets
>>> all relevant repos to our local mirrors.
>>>
>>> From the shell:
>>> - I can wget any URL (local or internet) from command line without a
>>> problem.
>>> - In particular I can wget the file 'setrepos.R' from command line.
>>>
>>> Symptoms:
>>> - with options(download.file.method='wget'), I can download any URL (local
>>> or internet) with download.file
>>> - I _cannot_ source any local or internet URL if http_proxy is set. It
>>> simply freezes. Using internet.info=0 gives the following messages:
>>> ############
>>> Warning messages:
>>> 1: In file(file, "r", encoding = encoding) :
>>> using HTTP proxy 'http://localhost:8080/'
>>> 2: In file(file, "r", encoding = encoding) :
>>> connected to 'localhost' on port 8080.
>>> 3: In file(file, "r", encoding = encoding) :
>>> -> (Proxy) GET http://*OUR_HOST*/~renaud/R/setrepos.R HTTP/1.0
>>> Host: *OUR_HOST*
>>> Pragma: no-cache
>>> User-Agent: R (2.13.2 x86_64-pc-linux-gnu x86_64 linux-gnu)
>>>
>>> 4: In file(file, "r", encoding = encoding) : <- HTTP/1.1 200 OK
>>> 5: In file(file, "r", encoding = encoding) : <- Via: 1.1 SRVWINTMG004
>>> 6: In file(file, "r", encoding = encoding) : <- Connection: Keep-Alive
>>> 7: In file(file, "r", encoding = encoding) : <- Proxy-Connection:
>>> Keep-Alive
>>> 8: In file(file, "r", encoding = encoding) : <- Content-Length: 1597
>>> 9: In file(file, "r", encoding = encoding) :
>>> <- Date: Wed, 05 Oct 2011 06:43:13 GMT
>>> 10: In file(file, "r", encoding = encoding) : <- Content-Type: text/plain
>>> 11: In file(file, "r", encoding = encoding) :
>>> <- ETag: "30b8018-63d-4a627b821c980"
>>> 12: In file(file, "r", encoding = encoding) :
>>> <- Server: Apache/2.2.9 (Ubuntu) DAV/2 SVN/1.5.1 PHP/5.2.6-2ubuntu4.6 with
>>> Suhosin-Patch mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.9 OpenSSL/0.9.8g
>>> mod_perl/2.0.4 Perl/v5.10.0
>>> 13: In file(file, "r", encoding = encoding) : <- Accept-Ranges: bytes
>>> 14: In file(file, "r", encoding = encoding) :
>>> <- Last-Modified: Mon, 20 Jun 2011 17:03:50 GMT
>>> 15: In file(file, "r", encoding = encoding) : Code 200, content-type
>>> 'text/plain'
>>> ############
>>>
>>> - Setting options(download.file.method='wget') before sourcing does not
>>> change the behaviour.
>>> - However, I can source any local URL if http_proxy='', without changing
>>> download.file.method. But then download.file does not work for internet
>>> URLs any more since the proxy settings are wrong. I could set
>>> http_proxy='', then source, then restore the proxy settings and set
>>> options(download.file.method='wget'). But this is just a workaround and I
>>> would like to understand what is going on.
>>>
>>> Session Info:
>>>
>>> R version 2.13.2 (2011-09-30)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_ZA.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_ZA.UTF-8 LC_COLLATE=en_ZA.UTF-8
>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_ZA.UTF-8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_ZA.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] devtools_0.4
>>>
>>> loaded via a namespace (and not attached):
>>> [1] RCurl_1.6-10 tools_2.13.2
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Renaud Gaujoux
>>> Computational Biology - University of Cape Town
>>> South Africa
>>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
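
The workaround mentioned in the quoted message (clear http_proxy, source, then restore the proxy) could be sketched roughly as below. with_proxy_cleared is a hypothetical name, not an R function. One caveat: the ?download.file help page notes that the proxy environment variables must be set before the download code is first used, so with the internal method this probably only helps if it runs before any other URL access in the session, for example early in .Rprofile.

```r
# Hypothetical helper (not part of base R): evaluate an expression with
# http_proxy temporarily set to "", then restore the previous value.
with_proxy_cleared <- function(expr) {
  old <- Sys.getenv("http_proxy", unset = NA)   # NA if the variable was not set
  Sys.setenv(http_proxy = "")
  on.exit({
    if (is.na(old)) Sys.unsetenv("http_proxy") else Sys.setenv(http_proxy = old)
  }, add = TRUE)

  # 'expr' is a promise, so it is only evaluated here, after the proxy
  # variable has been cleared.
  force(expr)
}
```

For example, with_proxy_cleared(source("http://*OUR_HOST*/~renaud/R/setrepos.R")) would attempt the source with no proxy set and leave http_proxy as it was for later download.file calls.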