[R] Parsing Google Finance page data?
Matt Considine
matt at considine.net
Fri Nov 21 03:45:14 CET 2014
FWIW, this is the kludge I came up with. The idea is that I only know
the name of the company, not the ticker/exchange. The following
admittedly doesn't work in all cases (e.g. "Time Warner"), so if anyone
knows how to return a list of tickers/exchanges for companies matching
a name, that would be helpful (though that question should probably go
to the finance list). In any case, thanks in advance for any thoughts
on this.
Matt
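library(RCurl)
library(xts)
library(XML)

# Want to return the results of this page:
#   http://www.google.com/finance?q=ibm
coname <- "ibm"
# URLencode() escapes spaces and other unsafe characters in the name
baseurl <- paste("http://www.google.com/finance?q=", URLencode(coname), sep = "")

# Read and parse the HTML page
doc.html <- htmlTreeParse(baseurl, useInternalNodes = TRUE)

# Market cap: row 4, column 2 of the second table on the page
tables <- readHTMLTable(doc.html, which = 2, as.data.frame = TRUE,
                        stringsAsFactors = FALSE)
mktcap <- tables[4, 2]

# Exchange/ticker: line 11 of the 11th <script> block
script.text <- unlist(xpathApply(doc.html, '//script', xmlValue))
block <- script.text[11]
exchangeticker <- unlist(strsplit(block, '\n'))[11]

# Currency: text of the 60th <div>
# (all of these positions are hard-coded and break if the page layout changes)
div.text <- unlist(xpathApply(doc.html, '//div', xmlValue))
currency <- div.text[60]

print(mktcap)
print(exchangeticker)
print(currency)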
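
Since tables[4,2] depends on the row order of the page, one possibly less
brittle variant is to look the value up by its row label instead. The
following is only a sketch: lookup_by_label is a hypothetical helper, and
both the "Mkt cap" label and the two-column layout (label in column 1,
value in column 2) are assumptions about the page, not something checked
against it. The demo data frame is made up so that the snippet runs
without a network connection.

```r
# Hypothetical helper: return the value in column 2 of the first row whose
# column 1 matches `label` (case-insensitive), or NA if no row matches.
lookup_by_label <- function(tbl, label) {
  labels <- trimws(as.character(tbl[[1]]))
  hit <- which(tolower(labels) == tolower(label))
  if (length(hit) == 0) return(NA_character_)
  trimws(as.character(tbl[[2]][hit[1]]))
}

# Made-up stand-in for the data frame returned by readHTMLTable();
# the labels and values are illustrative only, not real page data.
demo <- data.frame(
  V1 = c("Range", "52 week", "Open", "Mkt cap"),
  V2 = c("1.00 - 2.00", "0.50 - 3.00", "1.50", "10.00B"),
  stringsAsFactors = FALSE
)

mktcap_demo  <- lookup_by_label(demo, "Mkt cap")      # label assumed, see above
missing_demo <- lookup_by_label(demo, "No such row")  # no match gives NA
print(mktcap_demo)
```

With the real page this would be called as lookup_by_label(tables, "Mkt cap")
in place of tables[4,2], provided the label assumption holds.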
More information about the R-help
mailing list