[Rd] [R] HTTP User-Agent header
Robert Gentleman
rgentlem at fhcrc.org
Fri Jul 28 21:52:27 CEST 2006
OK, that suggests setting at the options level would solve both of your
problems and that seems like the best approach. I don't really want to
pass this around as a parameter through the maze of functions that might
actually download something if we don't have to.
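To make the options-level idea concrete, here is a rough sketch, not the
actual patch. The option name HTTPUserAgent and the helper fetch_agent() are
assumptions made up for illustration; nothing in this thread fixes either
name.

```r
# Sketch only. "HTTPUserAgent" is an assumed option name and
# fetch_agent() is a hypothetical helper; neither comes from Seth's patch.

# Set the agent string once, at the options level; keep the old value.
old <- options(HTTPUserAgent = paste("R", as.character(getRversion())))

# Code that downloads would look the option up where the request is built,
# so the functions in between need no extra argument.
fetch_agent <- function() getOption("HTTPUserAgent")
agent_set <- fetch_agent()

# Users who need to turn the header off could unset the option.
options(HTTPUserAgent = NULL)
agent_off <- fetch_agent()   # NULL here: no header would be sent

options(old)  # restore whatever was set before
```

The point of the sketch is only that a single lookup replaces an argument
threaded through every function that might download something.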
I think we can provide something early next week on R-devel for folks to
test. But I suspect, as Henrik also does, that the set of sites that will
refuse us with a User-Agent header will be much larger than the set that
James has found that refuse us without one.
best wishes
Robert
Henrik Bengtsson wrote:
> On 7/28/06, Robert Gentleman <rgentlem at fhcrc.org> wrote:
>> I wonder if it would not be better to make the user agent string
>> something that is configurable (at the time R is built) rather than at
>> run time. This would make Seth's patch about 1% as long. Or this could
>> be handled as an option. The patches are pretty extensive and allow for
>> setting the agent header by setting parameters in function calls (e.g.
>> download.file). I am not sure there is a good use case for that level
>> of flexibility and the additional code is substantial.
>>
>>
>> The issue that I think arises is that there are potentially other
>> systems that will be unhappy with R's identification of itself and so
>> some users may also need to turn it off.
>>
>> Any strong opinions?
>
> Actually two:
>
> 1) If you wish to pull down (read: extract from HTML or similar) live
> data from the web, you might want to be able to "imitate" a certain
> browser. For instance, if you tell some webserver you're a simple
> "mobile phone" or "lynx", you might be able to get back very clean data.
> Some servers might also block unknown web browsers.
>
> 2) If the webserver of a package repository decided to make use of
> the user-agent string to decide what version of the repository it
> should deliver, I would like to be able to trick the server. Why?
> Many times I have found myself working on a system where I do not have
> the rights to update to the latest or the development version of R.
> However, even without the very latest version of R you can still get
> work done. For instance, in Bioconductor, biocLite() & co give you
> either the stable or the development version of Bioconductor depending
> on your R version, but looking into the biocLite() code and beyond, you
> find that you actually can install a Bioconductor v1.9 package in R
> v2.3.1. It can be risky business, but if you know what you're doing, it
> can save your day (or week).
>
> Cheers
>
> Henrik
>
>>
>> James P. Howard, II wrote:
>>> On 7/28/06, Seth Falcon <sfalcon at fhcrc.org> wrote:
>>>
>>>> I have a rough draft patch, see below, that adds a User-Agent header
>>>> to HTTP requests made in R via download.file. If there is interest, I
>>>> will polish it.
>>> It looks right, but I am running under Windows without a compiler.
>>>
>> --
>> Robert Gentleman, PhD
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M2-B876
>> PO Box 19024
>> Seattle, Washington 98109-1024
>> 206-667-7700
>> rgentlem at fhcrc.org
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>