[R] File Downloading Problem
Santosh Srinivas
santosh.srinivas at gmail.com
Mon Nov 1 18:33:51 CET 2010
Thanks Duncan and Alex.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Duncan Temple Lang
Sent: 01 November 2010 22:34
To: r-help at r-project.org
Subject: Re: [R] File Downloading Problem
I got this working almost immediately with RCurl, although one has to
specify some value for the useragent option or the same error occurs.
The issue is that R does not add an Accept entry to the HTTP request header.
It should add something like
Accept: */*
Using RCurl,
library(RCurl)
u = "http://www.nseindia.com/content/historical/EQUITIES/2010/NOV/cm01NOV2010bhav.csv.zip"
o = getURLContent(u, verbose = TRUE, useragent = getOption("HTTPUserAgent"))
succeeds (but not if there is no useragent).
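As a minimal sketch of setting the Accept header explicitly from RCurl, the options can be collected in one place and passed through the .opts argument of getURLContent(). The helper names build_request_options() and fetch_url() are hypothetical, made up for this illustration. I have not checked whether this particular server is satisfied by the Accept header alone, so the user agent is still set.

```r
# Hypothetical helper: collect the curl options discussed above.
build_request_options <- function(user_agent = getOption("HTTPUserAgent"),
                                  accept = "*/*") {
  # The request failed without a user agent, so insist on a non-empty one.
  stopifnot(is.character(user_agent), nzchar(user_agent))
  list(
    useragent  = user_agent,         # any non-empty value
    httpheader = c(Accept = accept)  # explicit Accept header
  )
}

# Hypothetical helper: fetch a URL with those options.
# Needs the RCurl package and network access when it is called.
fetch_url <- function(url, opts = build_request_options()) {
  RCurl::getURLContent(url, .opts = opts)
}
```

With u defined as above, o = fetch_url(u) would issue the request with both the user agent and the Accept header set.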
We could fix R's download.file() to send Accept: */*,
or allow general headers to be specified either as an option for
all requests, or as a parameter of download.file() (or both).
Or we could have the makeUserAgent() function in utils be more customizable
through options, or allow the R user to specify the function herself.
But while this would be good, the HTTP facilities in R are not
intended to be as general as something like libcurl (and hence RCurl).
Unless there is a compelling reason to enhance R's internal facilities,
I suggest people use something like libcurl. This approach also has
the advantage of having the data directly in memory and avoiding writing
it to disk and then reading it back in, e.g.
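library(Rcompression)
z = zipArchive(o)                  # treat the downloaded content as a zip archive
names(z)                           # list the entries in the archive
read.csv(textConnection(z[[1]]))   # read the first entry as CSV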
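If Rcompression is not installed, a rough fallback using only base R is sketched below. Note that it is a different technique: it gives up the in-memory advantage and goes through a temporary file, because unz() only reads zip files from a local path. The function name read_csv_from_zip_bytes() is hypothetical, and the sketch assumes the downloaded content is a raw vector holding a zip archive whose first entry is the CSV file.

```r
# Hypothetical helper: read the first entry of a zip archive held in a raw
# vector as CSV, using a temporary file and base R only.
read_csv_from_zip_bytes <- function(bytes, ...) {
  stopifnot(is.raw(bytes))
  tmp <- tempfile(fileext = ".zip")
  on.exit(unlink(tmp), add = TRUE)      # remove the temporary file afterwards
  writeBin(bytes, tmp)                  # write the archive to disk
  members <- utils::unzip(tmp, list = TRUE)$Name   # names of the entries
  read.csv(unz(tmp, members[1]), ...)   # read.csv opens and closes the connection
}
```

With o from the RCurl call above, read_csv_from_zip_bytes(o) would then return the data frame, provided o really is a raw vector.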
D.
On 11/1/10 8:27 AM, Santosh Srinivas wrote:
> It's strange and the internet connection is fine because I am able to get
> data from yahoo.
> This was working till just yesterday ... strange if the website is creating
> issues with public access of basic data!
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: 01 November 2010 20:48
> To: Duncan Murdoch
> Cc: Santosh Srinivas; 'Rhelp'
> Subject: Re: [R] File Downloading Problem
>
>
> On Nov 1, 2010, at 10:41 AM, Duncan Murdoch wrote:
>
>> On 01/11/2010 10:37 AM, Santosh Srinivas wrote:
>>> Nope Duncan ... no changes .. the same old way without a proxy ... actually
>>> the download.file is being returned "403 forbidden" which is strange.
>>>
>>> These are just two lines that I am trying to run.
>>>
>>> sURL <- "http://www.nseindia.com/content/historical/EQUITIES/2010/NOV/cm01NOV2010bhav.csv.zip"
>>> download.file(sURL,"test.zip")
>>>
>>> Put the same URL in a browser and it works fine.
>>
>> It doesn't work for me, so presumably there is some kind of security
>> setting at the site (a cookie?), which allows your browser, but
>> doesn't allow you to use R, or me to use anything.
>
> Firefox in a Mac platform will download and unzip the file with no
> security complaints and no cookie appears to be set when downloading,
> but that code will not access the file, nor will my efforts to wrap
> the URL in url() or unz() so it seems more likely that Santosh and I
> do not understand the file opening processes that R supports.
>
> > con = unz(description="http://www.nseindia.com/content/historical/EQUITIES/2010/NOV/cm01NOV2010bhav.csv.zip", file="~/cm01NOV2010bhav.csv")
> > test.df <- read.csv(file=con)
> Error in open.connection(file, "rt") : cannot open the connection
> In addition: Warning message:
> In open.connection(file, "rt") :
>   cannot open zip file 'http://www.nseindia.com/content/historical/EQUITIES/2010/NOV/cm01NOV2010bhav.csv.zip'
>
>
>