[R] Behaviour of 'source' with URLs and proxy

Renaud Gaujoux renaud at mancala.cbio.uct.ac.za
Wed Oct 5 12:44:12 CEST 2011


From the help page ?file, I had read the following:

"For ‘url’ the description is a complete URL, including scheme
(such as ‘http://’, ‘ftp://’ or ‘file://’). Proxies can be
specified for HTTP and FTP ‘url’ connections: see ‘download.file’."

From the internet.info messages it seems that the proxy is actually
used by source, but somehow differently from how download.file uses it
(via wget).
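
Since download.file does get through with method 'wget', one option
would be to download the script to a temporary file and source the local
copy instead. A rough sketch only: source_via_download is a hypothetical
helper name of my own (not part of R), and the default method is simply
what works on my setup.

```r
# Hypothetical helper (not part of R): fetch a script with download.file(),
# which honours the requested download method, then source the local copy.
source_via_download <- function(url, method = "wget", ...) {
  tmp <- tempfile(fileext = ".R")
  on.exit(unlink(tmp))  # remove the temporary copy when done
  # download.file() returns 0 on success
  status <- download.file(url, destfile = tmp, method = method, quiet = TRUE)
  if (status != 0) stop("download failed for ", url)
  source(tmp, ...)
}
```

I would call it as
source_via_download("http://*OUR_HOST*/~renaud/R/setrepos.R").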

Is source supposed to work through a proxy?
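
In the meantime, the workaround I describe below amounts to something
like the following. This is only a sketch: source_without_proxy is a
hypothetical name of my own, and if I read ?download.file correctly the
internal code picks up the proxy variables when it is first used, so this
probably only helps when it runs before any other URL access in the
session (e.g. from .Rprofile).

```r
# Hypothetical helper (not part of R): source a file with http_proxy
# temporarily cleared, restoring the previous value even if source() fails.
source_without_proxy <- function(file, ...) {
  old <- Sys.getenv("http_proxy", unset = NA)
  Sys.setenv(http_proxy = "")
  on.exit({
    if (is.na(old)) {
      Sys.unsetenv("http_proxy")      # variable was not set before
    } else {
      Sys.setenv(http_proxy = old)    # put the original proxy back
    }
  })
  source(file, ...)
}
```

For example, source_without_proxy("http://*OUR_HOST*/~renaud/R/setrepos.R"),
followed by options(download.file.method='wget') so that download.file
keeps going through the proxy.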

-- 
Renaud Gaujoux
Computational Biology - University of Cape Town
South Africa


On 05/10/2011 12:26, Prof Brian Ripley wrote:
> On Wed, 5 Oct 2011, Renaud Gaujoux wrote:
>
>> Hi,
>>
>> I am having trouble sourcing a file on our local network from R.
>> It looks like such files are not properly accessed by 'source', even
>> though they can be downloaded with download.file (see my settings and
>> some tests below). I ended up with a workaround, but I would like to
>> understand what is going on.
>>
>> Don't source/readLines use the same mechanism as download.file to
>> access URLs?
>
> No. They use url() connections. See ?file.
>
>>
>> Thank you.
>>
>> Renaud
>>
>> My setting:
>> - I am using R 2.13.2 on Ubuntu 11.04.
>> - I am accessing the internet through a proxy (set up with cntlm; I am
>> not sure whether this is the issue, but I don't know how to check
>> without it). This means that http_proxy='http://localhost:8080/'.
>> - We have local CRAN/BioConductor mirrors that can be accessed
>> without going through the proxy.
>> - My .Rprofile sources a file 'setrepos.R' on the local network, that 
>> sets all relevant repos to our local mirrors.
>>
>> From the shell:
>> - I can wget any URL (local or internet) from the command line without
>> a problem.
>> - In particular, I can wget the file 'setrepos.R' from the command line.
>>
>> Symptoms:
>> - with options(download.file.method='wget'), I can download any URL 
>> (local or internet) with download.file
>> - I _cannot_ source any local or internet URL if http_proxy is set. 
>> It simply freezes. Using internet.info=0 gives the following messages:
>> ############
>> Warning messages:
>> 1: In file(file, "r", encoding = encoding) :
>> using HTTP proxy 'http://localhost:8080/'
>> 2: In file(file, "r", encoding = encoding) :
>> connected to 'localhost' on port 8080.
>> 3: In file(file, "r", encoding = encoding) :
>> -> (Proxy) GET http://*OUR_HOST*/~renaud/R/setrepos.R HTTP/1.0
>> Host: *OUR_HOST*
>> Pragma: no-cache
>> User-Agent: R (2.13.2 x86_64-pc-linux-gnu x86_64 linux-gnu)
>>
>> 4: In file(file, "r", encoding = encoding) : <- HTTP/1.1 200 OK
>> 5: In file(file, "r", encoding = encoding) : <- Via: 1.1 SRVWINTMG004
>> 6: In file(file, "r", encoding = encoding) : <- Connection: Keep-Alive
>> 7: In file(file, "r", encoding = encoding) : <- Proxy-Connection: 
>> Keep-Alive
>> 8: In file(file, "r", encoding = encoding) : <- Content-Length: 1597
>> 9: In file(file, "r", encoding = encoding) :
>> <- Date: Wed, 05 Oct 2011 06:43:13 GMT
>> 10: In file(file, "r", encoding = encoding) : <- Content-Type: 
>> text/plain
>> 11: In file(file, "r", encoding = encoding) :
>> <- ETag: "30b8018-63d-4a627b821c980"
>> 12: In file(file, "r", encoding = encoding) :
>> <- Server: Apache/2.2.9 (Ubuntu) DAV/2 SVN/1.5.1 PHP/5.2.6-2ubuntu4.6 
>> with Suhosin-Patch mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.9 
>> OpenSSL/0.9.8g mod_perl/2.0.4 Perl/v5.10.0
>> 13: In file(file, "r", encoding = encoding) : <- Accept-Ranges: bytes
>> 14: In file(file, "r", encoding = encoding) :
>> <- Last-Modified: Mon, 20 Jun 2011 17:03:50 GMT
>> 15: In file(file, "r", encoding = encoding) : Code 200, content-type 
>> 'text/plain'
>> ############
>>
>> - Setting options(download.file.method='wget') before sourcing does 
>> not change the behaviour.
>> - However, I can source any local URL if http_proxy='', without
>> changing download.file.method. But then download.file no longer works
>> for internet URLs, since the proxy settings are wrong. I could
>> set http_proxy='', then source, then restore the proxy settings and
>> set options(download.file.method='wget'). But this is just a
>> workaround and I would like to understand what is going on.
>>
>> Session Info:
>>
>> R version 2.13.2 (2011-09-30)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_ZA.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_ZA.UTF-8 LC_COLLATE=en_ZA.UTF-8
>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=en_ZA.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_ZA.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] devtools_0.4
>>
>> loaded via a namespace (and not attached):
>> [1] RCurl_1.6-10 tools_2.13.2
>>
>


