[R-SIG-Finance] RMySQL returning too much data?

Mark Knecht markknecht at gmail.com
Tue Jan 7 17:00:41 CET 2014


Using the code attached it appears to me that RMySQL is not paying
attention to the from=/to= date ranges the way getSymbols does when
going to Yahoo. WIth Yahoo I get the range I requested. With MySQL I
seem to get whatever range I have stored in the MySQL database.

When I run with UseMySQL = FALSE I get these results:

> index(tempSPY)[1]
[1] "2010-01-04"
> index(tempSPY)[nrow(tempSPY)]
[1] "2011-12-30"
> index(tempAGG)[1]
[1] "2010-01-04"
> index(tempAGG)[nrow(tempAGG)]
[1] "2011-12-30"
>
> tempSPY = subset(tempSPY, index(tempSPY)>= Date_Start)
> tempAGG = subset(tempAGG, index(tempAGG)>= Date_Start)
>
> index(tempSPY)[1]
[1] "2010-01-04"
> index(tempAGG)[1]
[1] "2010-01-04"

>


When I run with UseMySQL = TRUE I get these results:

> index(tempSPY)[1]
[1] "2000-01-03"
> index(tempSPY)[nrow(tempSPY)]
[1] "2014-01-06"
> index(tempAGG)[1]
[1] "2003-09-29"
> index(tempAGG)[nrow(tempAGG)]
[1] "2014-01-06"
>
> tempSPY = subset(tempSPY, index(tempSPY)>= Date_Start)
> tempAGG = subset(tempAGG, index(tempAGG)>= Date_Start)
>
> index(tempSPY)[1]
[1] "2010-01-04"
> index(tempAGG)[1]
[1] "2010-01-04"

>


I wonder if this is a bug in RMySQL or have I possibly stored my dates
or something else incorrectly in MySQL and somehow that's getting past
whatever range checking RMySQL expects to do? Here's what the end of
the data looks like for SPY taken from a terminal:

| 2013-12-24 | 182.54 | 183.01 | 182.53 | 182.93 |     45368800 | 182.93 |
| 2013-12-26 | 183.34 | 183.96 | 183.32 | 183.86 |     63365000 | 183.86 |
| 2013-12-27 |  184.1 | 184.18 | 183.66 | 183.85 |     61814000 | 183.85 |
| 2013-12-30 | 183.87 | 184.02 | 183.58 | 183.82 |     56817500 | 183.82 |
| 2013-12-31 | 184.07 | 184.69 | 183.93 | 184.69 |     86119900 | 184.69 |
| 2014-01-02 | 183.98 | 184.07 | 182.48 | 182.92 |    119364600 | 182.92 |
| 2014-01-03 | 183.23 |  183.6 | 182.63 | 182.89 |     81330600 | 182.89 |
| 2014-01-06 | 183.49 | 183.56 | 182.08 | 182.36 |    106828500 | 182.36 |
+------------+--------+--------+--------+--------+--------------+--------+
3524 rows in set (0.02 sec)

In MySQL Workbench I see this info on the table for SPY:

Table: SPY
Columns:
Date date
o double
h double
l double
c double
v double
a double


Thanks in advance,
Mark




library(blotter)
library(quantmod)
library(RMySQL)

Date_Start = "2010-01-01"
#Date_End = format(Sys.time(), "%Y-%m-%d")
Date_End = "2012-01-01"
SymbolSrc = "yahoo"
UseMySQL = FALSE
dbName = "Daily"
MyAdjust = TRUE
currency("USD")
InitialEquity = 100000

SymbolList = c("SPY","AGG")
NumSymbols = length(SymbolList)

if (UseMySQL){
  if (MyAdjust) { dbName = paste0(dbName, "Adjust")}
  dbc = dbConnect(MySQL(), dbname=dbName)
  setDefaults(getSymbols.MySQL, user="XXXX", password="YYYY", dbname=dbName)
}

if (!UseMySQL){
  for (i in 1:NumSymbols){
    print(paste("From ",SymbolSrc," -- ",SymbolList[i]))
    getSymbols(SymbolList[i], from = Date_Start, to = Date_End, adjust
= MyAdjust, src=SymbolSrc)
  }
} else {
  for (i in 1:NumSymbols){
    print(paste("From MySQL -- ",SymbolList[i]))
    getSymbols(SymbolList[i], from = Date_Start, to = Date_End, src="MySQL")
  }
}

tempSPY =get("SPY")
tempAGG =get("AGG")

index(tempSPY)[1]
index(tempSPY)[nrow(tempSPY)]
index(tempAGG)[1]
index(tempAGG)[nrow(tempAGG)]

tempSPY = subset(tempSPY, index(tempSPY)>= Date_Start)
tempAGG = subset(tempAGG, index(tempAGG)>= Date_Start)

index(tempSPY)[1]
index(tempAGG)[1]



More information about the R-SIG-Finance mailing list