[R-SIG-Finance] yahooKeystats broken?
jctoll
jctoll at gmail.com
Sun Jan 2 03:19:28 CET 2011
On Thu, Dec 23, 2010 at 9:29 AM, jctoll <jctoll at gmail.com> wrote:
> Hi,
>
> I've been trying to use the yahooKeystats function from fImport, but I
> think it may be broken.
>
>> yahooKeystats("IBM", try=FALSE)
> trying URL 'http://finance.yahoo.com/q/ks?s=IBM'
> Content type 'text/html; charset=utf-8' length unknown
> opened URL
> .......... .......... .......... ...
> downloaded 33 Kb
>
> Read 151 items
> Error in matrix(unlist(strsplit(x, "@")), byrow = TRUE, ncol = 2) :
> attempt to set an attribute on NULL
>
> I tried to check out the code and the problem is that x is basically
> empty when it gets passed to matrix().
>
> I think the problem is with this earlier section of code:
>
> if (length(Index) > 0)
> x = x[-Index]
>
> I'm not sure what they were trying to do with that code, but this
> looks like the problem.
>
> I was hoping someone could confirm whether yahooKeystats is working or
> not for them.
>
> Thanks,
>
>
> James
>
>
>
>> sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] graphics grDevices utils datasets stats methods base
>
> other attached packages:
> [1] fImport_2110.79 timeSeries_2130.90
> timeDate_2130.91 PerformanceAnalytics_1.0.3.2
> [5] quantmod_0.3-15 Defaults_1.1-1
> TTR_0.20-2 xts_0.7-5
> [9] zoo_1.6-4
>
> loaded via a namespace (and not attached):
> [1] grid_2.12.1 lattice_0.19-13
>
Hi,

In case anyone else needs to use yahooKeystats, I am including the
edited code I used to get it working. It appears Yahoo has changed
their page format again, so several of the tag-stripping patterns had
to be added or edited; those lines are marked with "# added" or
"# edited" comments. Additionally, there is a typographical error
within the HTML that requires a temporary workaround until it is
fixed. It's not pretty, but it works for me.

James
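
A minimal sketch of one way x can end up empty before it reaches matrix(), which is the failure described in the quoted message. The two input strings are made up for illustration and are not taken from the Yahoo page. It is an assumption here that the leading-space filter is what removes everything. The exact error text from matrix() depends on the R version.

```r
# Hypothetical stand-in for the scraped lines after the gsub() steps:
# every element still begins with a space.
x <- c("  Market Cap:180B", "  Trailing PE:13.2")

# The filter from the function drops every element that starts with a space.
Index <- grep("^ ", x)
if (length(Index) > 0)
    x <- x[-Index]
n_left <- length(x)          # nothing survives the filter

# strsplit() on an empty vector gives an empty list, and unlist() of that
# gives NULL, which matrix() cannot shape into two columns.
pieces <- unlist(strsplit(x, "@"))
result <- try(matrix(pieces, byrow = TRUE, ncol = 2), silent = TRUE)
failed <- inherits(result, "try-error")
```

Checking length(x) right after the filter is a quick way to see whether a page layout change has wiped out the input.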
yahooKeystats <- function (query, file = "tempfile", source = NULL, save = FALSE,
    try = TRUE)
{
    if (is.null(source))
        source = "http://finance.yahoo.com/q/ks?s="
    if (try) {
        # Call again with try = FALSE and trap any download or parsing error
        z = try(yahooKeystats(query, file, source, save, try = FALSE))
        if (inherits(z, "try-error") || inherits(z, "Error")) {
            return("No Internet Access")
        }
        else {
            return(z)
        }
    }
    else {
        # Download the key statistics page and keep only the table lines
        url = paste(source, query, sep = "")
        download.file(url = url, destfile = file)
        x = scan(file, what = "", sep = "\n")
        x = x[grep("datamodoutline1", x)]
        if (!length(x))
            return(NA)
        # Strip slashes, tag attributes and leftover markup
        x = gsub("/", "", x, perl = TRUE)
        x = gsub(" class=.yfnc_\\w+.", "", x, perl = TRUE)  # added
        x = gsub(" colspan=.2.", "", x, perl = TRUE)
        x = gsub(" cell.......=...", "", x, perl = TRUE)
        x = gsub(" border=...", "", x, perl = TRUE)
        x = gsub(" align=.\\w+.", "", x, perl = TRUE)  # added
        x = gsub(".nbsp.", "", x, perl = TRUE)  # added
        x = gsub("Financial Highlights", "", x, perl = TRUE)  # added
        x = gsub(" style=.display.block. height.10px.", "", x,
            perl = TRUE)  # added
        x = gsub(" id=.yfs_j10_\\w+.", "", x, perl = TRUE)  # added
        x = gsub(" width=.74%.>", "", x, perl = TRUE)  # edited
        x = gsub(" width=.100%.", "", x, perl = TRUE)
        x = gsub(" size=.-1.", "", x, perl = TRUE)
        x = gsub("<.>", "", x, perl = TRUE)
        x = gsub("<..>", "", x, perl = TRUE)
        x = gsub("<....>", "", x, perl = TRUE)
        x = gsub("<table>", "", x, perl = TRUE)
        x = gsub("<sup>.<sup>", "", x, perl = TRUE)
        x = gsub("&amp;", "&", x, perl = TRUE)
        # Mark each table cell with "@" and split into label:value strings
        x = gsub("<td", " @ ", x, perl = TRUE)
        x = gsub(",", "", x, perl = TRUE)
        x = unlist(strsplit(x, "@"))
        x = x[grep(":", x)]
        x = gsub("^ ", "", x, perl = TRUE)
        Index = grep("^ ", x)
        if (length(Index) > 0)
            x = x[-Index]
        x = gsub(" $", "", x, perl = TRUE)
        # Empty values become NA; the first colon becomes the column
        # separator and a second colon (e.g. in a split ratio) becomes "/"
        x = gsub(":$", ":NA", x, perl = TRUE)
        x = sub(":", "@", x)
        x = sub(":", "/", x)
        x = matrix(unlist(strsplit(x, "@")), byrow = TRUE, ncol = 2)
        stats = as.character(Sys.Date())
        x = rbind(c("Symbol", query), c("Date", stats), x)
        # added: temporary kludge to fix yahoo typo
        x[53,1] <- "Trailing Annual Dividend Rate"
        X = as.data.frame(x[, 2])
        rownames(X) = x[, 1]
        colnames(X) = "Value"
    }
    X
}
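
To make the last few steps of the function easier to follow, here is a small sketch that runs them on hand-written input. The four label:value strings are hypothetical and only stand in for what x looks like once the HTML has been stripped; nothing here touches the network.

```r
# Hypothetical label:value strings, standing in for x after tag stripping.
x <- c("Market Cap (intraday):180.22B",
       "Forward PE (fye Dec 31 2011):",
       "Fiscal Year Ends:31-Dec",
       "Last Split Factor (new per old):2:1")

x <- gsub(":$", ":NA", x, perl = TRUE)  # a label with no value gets "NA"
x <- sub(":", "@", x)                   # first colon separates label and value
x <- sub(":", "/", x)                   # a remaining colon (split ratio) becomes "/"
x <- matrix(unlist(strsplit(x, "@")), byrow = TRUE, ncol = 2)

# Same shaping as in the function: labels as row names, one "Value" column.
X <- as.data.frame(x[, 2])
rownames(X) <- x[, 1]
colnames(X) <- "Value"
print(X)
```

Because the values are kept as text, something like "180.22B" still needs converting before it can be used in arithmetic.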