[R] Downloading data from internet
cls59
chuck at sharpsteen.net
Thu Sep 24 16:27:57 CEST 2009
Bogaso wrote:
>
> Hi all,
>
> I want to download data from those two different sources, directly into R
> :
>
> http://www.rateinflation.com/consumer-price-index/usa-cpi.php
> http://eaindustry.nic.in/asp2/list_d.asp
>
> First one is CPI of US and 2nd one is WPI of India. Can anyone please give
> any clue how to download them directly into R. I want to make them zoo
> object for further analysis.
>
> Thanks,
>
The following site did not load for me:
http://eaindustry.nic.in/asp2/list_d.asp
But I was able to extract the table from the US CPI site using Duncan Temple
Lang's XML package:
library(XML)
First, download the website into R:
html.raw <- readLines(
'http://www.rateinflation.com/consumer-price-index/usa-cpi.php' )
Then, convert to an HTML object using the XML package:
html.data <- htmlTreeParse( html.raw, asText = TRUE, useInternalNodes = TRUE )
A quick scan of the page source in the browser reveals that the table you
want is enclosed in a div with a class of "dynamicContent", so we will use an
XPath expression[1] to retrieve all rows in that table:
table.html <- getNodeSet( html.data,
'//div[@class="dynamicContent"]/table/tr' )
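To see what that XPath expression selects without hitting the network, here is a minimal sketch run against a small inline HTML string. The markup in sample.html is a hypothetical stand-in for the real page (whose layout may differ), and the cell values are placeholders, not real CPI figures; the names sample.html, sample.doc and sample.rows are only for illustration.

```r
library(XML)

# Hypothetical stand-in for the downloaded page; placeholder values only.
sample.html <- '<html><body>
<div class="dynamicContent">
<table>
<tr><td>Year</td><td>Jan</td><td>Feb</td></tr>
<tr><td>2001</td><td>3</td><td>4</td></tr>
<tr><td>2000</td><td>1</td><td>2</td></tr>
</table>
</div>
<div class="other"><table><tr><td>ignored</td></tr></table></div>
</body></html>'

# Parse the string into an internal document so XPath queries can be run on it
sample.doc <- htmlTreeParse( sample.html, asText = TRUE, useInternalNodes = TRUE )

# Only rows of the table inside the "dynamicContent" div are selected;
# the table in the other div is skipped
sample.rows <- getNodeSet( sample.doc,
  '//div[@class="dynamicContent"]/table/tr' )

length( sample.rows )
```

If the real page wraps its rows in a tbody element, or the div carries more than one class, the expression would need adjusting, for example to '//div[@class="dynamicContent"]//tr'.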
Now, the data values can be extracted from the cells in the rows using a
little sapply and xpathSApply voodoo:
table.data <- t( sapply( table.html, function( row ){
  # Collect the text of every <td> cell in this row
  row.data <- xpathSApply( row, './td', xmlValue )
  return( row.data )
}))
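Since the goal was a zoo object, here is a rough sketch of how a character matrix like table.data might be turned into a monthly series. It assumes, without having checked the page, that the first row is a header, the first column holds the year, and the remaining columns are consecutive months starting with January. The matrix demo.data is a hypothetical stand-in with placeholder values, not real CPI figures.

```r
library(zoo)

# Hypothetical stand-in for table.data (placeholder values, assumed layout:
# header row, then one row per year with one column per month)
demo.data <- rbind(
  c("Year", "Jan", "Feb", "Mar"),
  c("2001", "4", "5", "6"),
  c("2000", "1", "2", "3")
)

# Drop the header row, then split the year column from the monthly values
rows   <- demo.data[-1, , drop = FALSE]
years  <- as.integer( rows[, 1] )
values <- matrix( as.numeric( rows[, -1] ), nrow = nrow( rows ) )

# Build a year-month index: year + (month - 1)/12 for every cell,
# flattened in the same column-major order as as.vector(values)
months <- seq_len( ncol( values ) )
index  <- as.yearmon( as.vector( outer( years, ( months - 1 ) / 12, "+" ) ) )

# zoo sorts the observations by the index
cpi <- zoo( as.vector( values ), order.by = index )
cpi
```

With the real table, any extra column (such as an annual average, if the page has one) would need to be dropped before building the index, and cells that are not numbers would come through as NA.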
Good luck!
-Charlie
[1]: http://www.w3schools.com/XPath/xpath_syntax.asp
-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University