[R] Downloading data from the internet

cls59 chuck at sharpsteen.net
Thu Sep 24 16:27:57 CEST 2009




Bogaso wrote:
> 
> Hi all,
> 
> I want to download data from these two sources directly into R:
> 
> http://www.rateinflation.com/consumer-price-index/usa-cpi.php
> http://eaindustry.nic.in/asp2/list_d.asp
> 
> The first is the CPI of the US and the second is the WPI of India. Can
> anyone please give me a clue how to download them directly into R? I want
> to turn them into zoo objects for further analysis.
> 
> Thanks,
> 

The following site did not load for me:

http://eaindustry.nic.in/asp2/list_d.asp

But I was able to extract the table from the US CPI site using Duncan Temple
Lang's XML package:

  library(XML)


First, download the page source into R:

  html.raw <- readLines(
    'http://www.rateinflation.com/consumer-price-index/usa-cpi.php' )

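Then, parse it into an HTML document using the XML package. readLines()
returns one string per line, so the lines are collapsed into a single
string first:

  html.data <- htmlTreeParse( paste( html.raw, collapse = "\n" ),
    asText = TRUE, useInternalNodes = TRUE )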
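
For anyone who wants to try the parsing step without hitting the network, here
is a minimal sketch on a small inline page. The page content (html.lines) is
made up for illustration and only mimics the layout described in this post; it
is not the real CPI page.

```r
library(XML)

# Hypothetical stand-in for the lines that readLines() would return.
# The numbers are invented, not real CPI figures.
html.lines <- c(
  '<html><body>',
  '<div class="dynamicContent">',
  '<table>',
  '<tr><td>Year</td><td>Jan</td><td>Feb</td></tr>',
  '<tr><td>2008</td><td>100.0</td><td>100.5</td></tr>',
  '</table>',
  '</div>',
  '</body></html>'
)

# Collapse the per-line vector into one string, then parse it as HTML.
# useInternalNodes = TRUE keeps the tree in C-level nodes, which is what
# the XPath functions (getNodeSet, xpathSApply) operate on.
doc <- htmlTreeParse( paste( html.lines, collapse = "\n" ),
  asText = TRUE, useInternalNodes = TRUE )

# The result is an internal document, not a plain R list
is.internal <- inherits( doc, "XMLInternalDocument" )
print( is.internal )
```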

A quick scan of the page source in the browser reveals that the table you
want is wrapped in a div with a class of "dynamicContent". We will use an
XPath expression[1] to retrieve all rows in that table:

  table.html <- getNodeSet( html.data,
    '//div[@class="dynamicContent"]/table/tr' )
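
To see what that XPath expression does and does not match, here is a small
sketch on a made-up document with two divs. The names toy.html, toy.doc and
rows are mine, and the markup is an assumption for illustration only.

```r
library(XML)

# Hypothetical page: one table in an unrelated div, one in the div we want
toy.html <- paste(
  '<html><body>',
  '<div class="other"><table><tr><td>skip</td></tr></table></div>',
  '<div class="dynamicContent"><table>',
  '<tr><td>a</td><td>b</td></tr>',
  '<tr><td>c</td><td>d</td></tr>',
  '</table></div>',
  '</body></html>',
  sep = "\n" )

toy.doc <- htmlTreeParse( toy.html, asText = TRUE, useInternalNodes = TRUE )

# '//div[@class="dynamicContent"]' matches a div anywhere in the document
# whose class attribute is exactly "dynamicContent"; '/table/tr' then steps
# down to the rows of a table that is a direct child of that div.
rows <- getNodeSet( toy.doc, '//div[@class="dynamicContent"]/table/tr' )

# Only the two rows inside the "dynamicContent" div are returned
print( length( rows ) )
print( sapply( rows, xmlName ) )
```

Note that the match on @class is exact, so a div with class="dynamicContent
wide" would not be selected by this expression.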

Now, the data values can be extracted from the cells in the rows using a
little sapply and xpathSApply voodoo:

  table.data <- t( sapply( table.html, function( row ){

    # Text content of every <td> cell in this row
    row.data <- xpathSApply( row, './td', xmlValue )
    return( row.data )

  }))
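
Since the goal was a zoo object, here is a rough sketch of one way to get from
a character matrix like table.data to a monthly zoo series. I am ASSUMING a
layout with a header row, a year in the first column and one column per month;
the matrix below is made up, so check the real table before relying on this.

```r
library(zoo)

# Made-up character matrix in the shape the sapply() call would return,
# ASSUMING a header row followed by one row per year. Invented numbers.
table.data <- rbind(
  c( "Year", "Jan",   "Feb",   "Mar"   ),
  c( "2008", "100.0", "100.5", "101.0" ),
  c( "2009", "101.5", "102.0", "102.5" )
)

# Drop the header row; split the year column from the monthly values
years    <- as.integer( table.data[-1, 1] )
values   <- table.data[-1, -1, drop = FALSE]
n.months <- ncol( values )

# Transpose before flattening so the values run in time order
# (as.numeric() reads a matrix column by column)
cpi.values <- as.numeric( t( values ) )

# Year-month index: yearmon stores dates as year + (month - 1) / 12
cpi.index <- as.yearmon( rep( years, each = n.months ) +
  ( rep( seq_len( n.months ), times = length( years ) ) - 1 ) / 12 )

cpi <- zoo( cpi.values, order.by = cpi.index )
print( cpi )
```

Any cell that is not a plain number (footnote marks, blanks for months not yet
published) will come out of as.numeric() as NA with a warning, so those would
need cleaning up first.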


Good luck!

-Charlie
 
  [1]:  http://www.w3schools.com/XPath/xpath_syntax.asp

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University