[R] Finding the right url for RCurl
Brian Diggs
diggsb at ohsu.edu
Thu Aug 5 19:32:06 CEST 2010
On 8/4/2010 2:07 PM, AndrewPage wrote:
>
> Hi all,
>
> I am using RCurl to try and download data from a website, but I'm having
> trouble finding out what URL to use. Here is the site:
>
> http://www.invescopowershares.com/products/holdings.aspx?ticker=PGX
>
> See how in the upper right, above the displayed sheet, there's a link to
> download the data as a .csv file? When I hit "copy url" and paste into
> getURL in R, it doesn't work. That's no surprise because there isn't a URL
> in what gets pasted. I was just wondering if there's any way around this.
>
> Thanks in advance,
>
> Andrew
I looked at the page. The link you mentioned runs some JavaScript that
alters some values in a form and posts that form; the response to that
post is the CSV file. There is no simple URL that points to the file. I
don't know if RCurl can post forms, but if it can you may be able to
mimic the form. The structure of the form starts on line 191 of the
page source (or search for "aspnetForm") and appropriate values for
__EVENTTARGET are given in the doPostBack call on line 258. Some
understanding of HTML and HTTP may be necessary to know what is going on.
I don't know whether this would work. Also, the site has not made it
easy to download the CSV file directly, which may be intentional. The
site's Terms of Service may have something to say about doing this as
well.
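
As a rough, untested sketch of what mimicking the form might look like:
RCurl does provide getURL, getCurlHandle and postForm, but everything
else here is an assumption. The helper names (hidden_value,
build_postback, fetch_csv) are made up for illustration, the hidden
fields __VIEWSTATE and __EVENTVALIDATION are the ones ASP.NET pages
typically carry (this page may need others), and the __EVENTTARGET value
is left as an argument because it has to be read from the doPostBack
call in the page source.

```r
# Pull the value of a named hidden <input> out of raw HTML.
# Assumes name="..." appears before value="..." within the tag.
hidden_value <- function(html, name) {
  pattern <- sprintf('<input[^>]*name="%s"[^>]*value="([^"]*)"', name)
  m <- regmatches(html, regexec(pattern, html))[[1]]
  if (length(m) < 2) NA_character_ else m[2]
}

# Build the fields to post back; hidden fields not found on the page are skipped.
build_postback <- function(html, event_target) {
  fields <- c("__VIEWSTATE", "__EVENTVALIDATION")
  hidden <- vapply(fields, function(f) hidden_value(html, f), character(1))
  as.list(c("__EVENTTARGET" = event_target,
            "__EVENTARGUMENT" = "",
            hidden[!is.na(hidden)]))
}

# Fetch the page, then post the form back using the same handle (and cookies).
# Needs the RCurl package and network access; not run here.
fetch_csv <- function(page_url, event_target) {
  curl <- RCurl::getCurlHandle(cookiefile = "", followlocation = TRUE)
  html <- RCurl::getURL(page_url, curl = curl)
  RCurl::postForm(page_url,
                  .params = build_postback(html, event_target),
                  style = "post",
                  curl = curl)
}
```

If the post succeeds, the result of fetch_csv(page_url, event_target)
should be the CSV text, which read.csv(text = ...) could then parse.
Whether the server accepts a request built this way is something I
have not checked.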
--
Brian Diggs
Senior Research Associate, Department of Surgery, Oregon Health &
Science University