[R-SIG-Finance] RMySQL returning too much data?

Joshua Ulrich josh.m.ulrich at gmail.com
Tue Jan 7 18:16:36 CET 2014


Hi Mark,

On Tue, Jan 7, 2014 at 10:00 AM, Mark Knecht <markknecht at gmail.com> wrote:
> Using the code attached it appears to me that RMySQL is not paying
> attention to the from=/to= date ranges the way getSymbols does when
> going to Yahoo. WIth Yahoo I get the range I requested. With MySQL I
> seem to get whatever range I have stored in the MySQL database.
>
getSymbols.MySQL doesn't have from or to args.

> args(getSymbols.MySQL)
function (Symbols, env, return.class = "xts", db.fields = c("date",
    "o", "h", "l", "c", "v", "a"), field.names = NULL, user = NULL,
    password = NULL, dbname = NULL, host = "localhost", port = 3306,
    ...)

While they could potentially be added; it's probably more trouble than
it's worth in this case, since your data aren't that large and you can
subset after the getSymbols call returns:
foo[paste0(Date_Start,"/",Date_end)].

> When I run with UseMySQL = FALSE I get these results:
>
>> index(tempSPY)[1]
> [1] "2010-01-04"
>> index(tempSPY)[nrow(tempSPY)]
> [1] "2011-12-30"
>> index(tempAGG)[1]
> [1] "2010-01-04"
>> index(tempAGG)[nrow(tempAGG)]
> [1] "2011-12-30"
>>
>> tempSPY = subset(tempSPY, index(tempSPY)>= Date_Start)
>> tempAGG = subset(tempAGG, index(tempAGG)>= Date_Start)
>>
>> index(tempSPY)[1]
> [1] "2010-01-04"
>> index(tempAGG)[1]
> [1] "2010-01-04"
>
>>
>
>
> When I run with UseMySQL = TRUE I get these results:
>
>> index(tempSPY)[1]
> [1] "2000-01-03"
>> index(tempSPY)[nrow(tempSPY)]
> [1] "2014-01-06"
>> index(tempAGG)[1]
> [1] "2003-09-29"
>> index(tempAGG)[nrow(tempAGG)]
> [1] "2014-01-06"
>>
>> tempSPY = subset(tempSPY, index(tempSPY)>= Date_Start)
>> tempAGG = subset(tempAGG, index(tempAGG)>= Date_Start)
>>
>> index(tempSPY)[1]
> [1] "2010-01-04"
>> index(tempAGG)[1]
> [1] "2010-01-04"
>
>>
>
>
> I wonder if this is a bug in RMySQL or have I possibly stored my dates
> or something else incorrectly in MySQL and somehow that's getting past
> whatever range checking RMySQL expects to do? Here's what the end of
> the data looks like for SPY taken from a terminal:
>
> | 2013-12-24 | 182.54 | 183.01 | 182.53 | 182.93 |     45368800 | 182.93 |
> | 2013-12-26 | 183.34 | 183.96 | 183.32 | 183.86 |     63365000 | 183.86 |
> | 2013-12-27 |  184.1 | 184.18 | 183.66 | 183.85 |     61814000 | 183.85 |
> | 2013-12-30 | 183.87 | 184.02 | 183.58 | 183.82 |     56817500 | 183.82 |
> | 2013-12-31 | 184.07 | 184.69 | 183.93 | 184.69 |     86119900 | 184.69 |
> | 2014-01-02 | 183.98 | 184.07 | 182.48 | 182.92 |    119364600 | 182.92 |
> | 2014-01-03 | 183.23 |  183.6 | 182.63 | 182.89 |     81330600 | 182.89 |
> | 2014-01-06 | 183.49 | 183.56 | 182.08 | 182.36 |    106828500 | 182.36 |
> +------------+--------+--------+--------+--------+--------------+--------+
> 3524 rows in set (0.02 sec)
>
> In MySQL Workbench I see this info on the table for SPY:
>
> Table: SPY
> Columns:
> Date date
> o double
> h double
> l double
> c double
> v double
> a double
>
>
> Thanks in advance,
> Mark
>
>
>
>
> library(blotter)
> library(quantmod)
> library(RMySQL)
>
> Date_Start = "2010-01-01"
> #Date_End = format(Sys.time(), "%Y-%m-%d")
> Date_End = "2012-01-01"
> SymbolSrc = "yahoo"
> UseMySQL = FALSE
> dbName = "Daily"
> MyAdjust = TRUE
> currency("USD")
> InitialEquity = 100000
>
> SymbolList = c("SPY","AGG")
> NumSymbols = length(SymbolList)
>
> if (UseMySQL){
>   if (MyAdjust) { dbName = paste0(dbName, "Adjust")}
>   dbc = dbConnect(MySQL(), dbname=dbName)
>   setDefaults(getSymbols.MySQL, user="XXXX", password="YYYY", dbname=dbName)
> }
>
> if (!UseMySQL){
>   for (i in 1:NumSymbols){
>     print(paste("From ",SymbolSrc," -- ",SymbolList[i]))
>     getSymbols(SymbolList[i], from = Date_Start, to = Date_End, adjust
> = MyAdjust, src=SymbolSrc)
>   }
> } else {
>   for (i in 1:NumSymbols){
>     print(paste("From MySQL -- ",SymbolList[i]))
>     getSymbols(SymbolList[i], from = Date_Start, to = Date_End, src="MySQL")
>   }
> }
>
> tempSPY =get("SPY")
> tempAGG =get("AGG")
>
> index(tempSPY)[1]
> index(tempSPY)[nrow(tempSPY)]
> index(tempAGG)[1]
> index(tempAGG)[nrow(tempAGG)]
>
> tempSPY = subset(tempSPY, index(tempSPY)>= Date_Start)
> tempAGG = subset(tempAGG, index(tempAGG)>= Date_Start)
>
> index(tempSPY)[1]
> index(tempAGG)[1]
>

--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com



More information about the R-SIG-Finance mailing list