[R-SIG-Finance] yahoo dates
G See
gsee000 at gmail.com
Thu Jul 12 16:06:35 CEST 2012
The workaround for get.hist.quote sounds like a good one, and I would
love to see something similar added to getSymbols.yahoo.
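A minimal sketch of what such a check might look like on the xts object that getSymbols returns. `drop_dup_last` is a hypothetical helper, not a quantmod or tseries function, and the toy rows are the IBM values quoted further down in this thread. It assumes the final row of a duplicated pair is the one worth keeping, which is what the get.hist.quote output appears to do.

```r
library(xts)

# Hypothetical helper (not part of quantmod or tseries): if the last two
# rows share a date, drop the earlier of the pair and keep the final row.
drop_dup_last <- function(x) {
  n <- nrow(x)
  idx <- as.Date(index(x))
  if (n >= 2 && idx[n] == idx[n - 1]) {
    x <- x[-(n - 1), ]
  }
  x
}

# Toy data copied from the IBM output quoted in this thread
ibm <- xts(
  matrix(c(193.92, 193.94, 189.74, 191.41, 4952900,
           190.66, 191.00, 188.05, 189.65, 3569800,
           190.76, 191.00, 188.05, 189.67, 3988100),
         ncol = 5, byrow = TRUE,
         dimnames = list(NULL, c("Open", "High", "Low", "Close", "Volume"))),
  order.by = as.Date(c("2012-07-06", "2012-07-09", "2012-07-09"))
)

cleaned <- drop_dup_last(ibm)
print(cleaned)
```

A series without a duplicated last date passes through unchanged, so the check could be applied unconditionally after each download.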
I did not mean to suggest that there's not a problem; the point I was
trying to get at is that I think it only occurs with historical data,
not "current" data. So, the user could record their own data. I
realize that's a little off topic to the question though.
To test my theory, I've been "streaming" quotes from yahoo for the
past couple days by repeatedly requesting current quotes. I've found
that the extra volume never appears using this method. When the
market closes, the Volume stops changing. When the market opens, the
volume jumps to zero (or close to it). So, one workaround might be to
replace the last day (and its duplicate, if there is one) with
the data returned by `getQuote`.
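That replacement could be sketched roughly as follows. `replace_last_day` is a hypothetical helper, and `quote_row` is a hand-built stand-in for what one would construct from `getQuote` (the values are just the last IBM row quoted in this thread), so nothing here touches the network. Building the row from a live quote is left as a comment because the exact column names returned by `getQuote` are an assumption on my part.

```r
library(xts)

# Hypothetical helper: drop every row dated on the last day of `x`
# and append a single replacement row for that day.
replace_last_day <- function(x, new_row) {
  last_day <- as.Date(tail(index(x), 1))
  keep <- as.Date(index(x)) != last_day
  rbind(x[keep, ], new_row)
}

cols <- c("Open", "High", "Low", "Close", "Volume")

# Toy historical data with a duplicated last date (IBM values from this thread)
ibm <- xts(
  matrix(c(193.92, 193.94, 189.74, 191.41, 4952900,
           190.66, 191.00, 188.05, 189.65, 3569800,
           190.76, 191.00, 188.05, 189.67, 3988100),
         ncol = 5, byrow = TRUE, dimnames = list(NULL, cols)),
  order.by = as.Date(c("2012-07-06", "2012-07-09", "2012-07-09"))
)

# Stand-in for a row built from getQuote(). With a live quote it might be
# something like the following (column names assumed, not verified):
#   q <- getQuote("IBM")
#   quote_row <- xts(as.matrix(q[, c("Open", "High", "Low", "Last", "Volume")]),
#                    order.by = as.Date(q[["Trade Time"]]))
quote_row <- xts(
  matrix(c(190.76, 191.00, 188.05, 189.67, 3988100),
         nrow = 1, dimnames = list(NULL, cols)),
  order.by = as.Date("2012-07-09")
)

patched <- replace_last_day(ibm, quote_row)
print(patched)
```

Note that an object from getSymbols also carries an Adjusted column, so a real replacement row would need a value for that as well.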
For reference, here's the code I used to collect the data, although it
probably shouldn't be used as-is because Yahoo probably doesn't like
folks hitting their server this hard.

library(quantmod)
filename <- "~/GSPCintra.csv"
file.create(filename)
# Add headers
cat(paste0("Sys.time,",
           paste(make.names(colnames(getQuote("^GSPC"))), collapse=","),
           "\n"),
    file=filename)
# record data; break with ctrl-c
while (TRUE) {
  try(cat(paste0(Sys.time(), ",",
                 paste(getQuote("^GSPC")[1, ], collapse=","), "\n"),
          file=filename, append=TRUE))
}
# retrieve
tmp <- read.table(filename, stringsAsFactors=FALSE, sep=",", header=TRUE)
x <- xts(tmp[, c(6:8, 3, 9)], as.POSIXct(tmp[, 1]))

Cheers,
Garrett
On Tue, Jul 10, 2012 at 4:56 AM, Achim Zeileis <Achim.Zeileis at uibk.ac.at> wrote:
> On Mon, 9 Jul 2012, G See wrote:
>
>> FWIW, I download 33 fields from yahoo every night at 10 p.m. CDT using
>
>
> I'm not sure but maybe that is still too early. The problem is real and
> occurs for me "now" (around 10:00 GMT), both for individual stocks and
> indexes:
>
> R> library("quantmod")
> R> getSymbols(c("^GSPC", "IBM"))
> R> tail(GSPC, 3)
>            GSPC.Open GSPC.High GSPC.Low GSPC.Close GSPC.Volume GSPC.Adjusted
> 2012-07-06   1367.09   1367.09  1348.03    1354.68  2745140000       1354.68
> 2012-07-09   1354.66   1354.87  1346.65    1352.51   399252300       1352.46
> 2012-07-09   1354.66   1354.87  1346.65    1352.46  2904860000       1352.46
> R> tail(IBM, 3)
>            IBM.Open IBM.High IBM.Low IBM.Close IBM.Volume IBM.Adjusted
> 2012-07-06   193.92   193.94  189.74    191.41    4952900       191.41
> 2012-07-09   190.66   191.00  188.05    189.65    3569800       189.67
> 2012-07-09   190.76   191.00  188.05    189.67    3988100       189.67
>
> Note the last line in both cases, especially the volume. The same is
> visible, of course, at the Yahoo! Finance site:
>
> http://finance.yahoo.com/q/hp?s=^GSPC+Historical+Prices
> http://finance.yahoo.com/q/hp?s=IBM+Historical+Prices
>
> Users of Yahoo! Finance also complained about this in the user forum. But as
> nobody could offer a good explanation for this, we implemented a patch in
> tseries' get.hist.quote() that omits the last observation in case its date
> is duplicated:
>
This sounds like something quantmod should consider doing.
> R> library("tseries")
> R> tail(get.hist.quote("IBM"), 3)
>              Open   High    Low  Close
> 2012-07-05 194.88 196.85 193.63 195.29
> 2012-07-06 193.92 193.94 189.74 191.41
> 2012-07-09 190.76 191.00 188.05 189.67
> Warning message:
> In get.hist.quote("IBM") : first date duplicated, first instance omitted
>
> Best,
>
> Z
>