[R-SIG-Finance] Process used to manage workspace and large data files

s algotr8der at gmail.com
Sat May 14 15:54:30 CEST 2011


On 5/13/11 10:08 PM, Brian G. Peterson wrote:
> We stored tick data as binary xts (not XTS, R is case sensitive)
> rda/RData objects on disk.  This was more than fast enough, and far
> faster than MySQL.
Thank you Brian.

I tried this and was able to save an xts object and retrieve it using
getSymbols(Symbols, src='rda').

Now, how do you maintain these rda files as new data arrives each day?
Is there some way of appending incremental data to the stored object?
For example, if I have minute data for QQQ from 2002-01-01 to 2011-05-13,
I don't want to pull all of the data since 2002-01-01 from my source
again just to add one day's worth of data at the close of business this
coming Monday.
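
To make the question concrete, here is a minimal sketch of the kind of
incremental update I have in mind. It assumes one .rda file per symbol,
each holding a single xts object named after the symbol. The helper
update_rda() and its arguments are hypothetical names used only for
illustration; they are not part of xts or quantmod.

```r
library(xts)

# Hypothetical helper: append only the rows of `new_data` that are newer
# than the last timestamp already stored in <dir>/<symbol>.rda.
update_rda <- function(symbol, new_data, dir = ".") {
  path <- file.path(dir, paste0(symbol, ".rda"))
  env <- new.env()
  if (file.exists(path)) {
    load(path, envir = env)                    # restores the object named `symbol`
    old <- get(symbol, envir = env)
    last_ts <- index(old)[nrow(old)]           # last stored timestamp
    keep <- index(new_data) > last_ts          # rows not yet on disk
    combined <- if (any(keep)) rbind(old, new_data[keep]) else old
  } else {
    combined <- new_data                       # first save for this symbol
  }
  assign(symbol, combined, envir = env)
  save(list = symbol, file = path, envir = env)
  invisible(combined)
}
```

With something like this, the end-of-day job would only need to fetch
Monday's bars and call update_rda("QQQ", mondays_bars), where
mondays_bars is a placeholder for the new day's xts data. The whole file
is still rewritten on each save, so this avoids re-pulling from the
source but not the disk write.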


>    - Brian
>
> On Fri, 2011-05-13 at 18:48 -0700, algotr8der wrote:
>> I think I have been managing my workspace and large data files
>> inefficiently. While I was working with daily stock price data the
>> inefficiencies were manageable, but I have now moved to intraday data and
>> need to optimize. To that end, I installed a MySQL database on the same
>> machine where I run R to store the intraday data (minute frequency).
>>
>> However, I find that loading minute stock data using dbGetQuery is very
>> slow. I have read several comments here on using RData files or
>> getSymbols.MySQL but would greatly appreciate further insight.
>>
>> The structure of the table in MySQL that stores the minute data is as
>> follows:
>>
>> date, time, open, high, low, close, volume
>>
>> Every time I load data into R I have to manipulate the date and time
>> columns to get the data into the right format before I can build an xts
>> object from it. I'm wondering whether I need to combine the date and
>> time columns in MySQL into one date column with the following format so
>> that I can use getSymbols.MySQL:
>>
>> %m/%d/%Y %h:%i:%s %p
>>
>> It seems to me that getSymbols.MySQL would need the date to be in the
>> aforementioned format; otherwise, how would it produce an xts object
>> when the time is stored in a separate column?
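
For reference, the date/time manipulation described above can be
sketched in R roughly as follows. This is only an illustration under
stated assumptions: the data frame stands in for the result of
dbGetQuery, the date and time columns are assumed to arrive as character
strings in "YYYY-MM-DD" and 24-hour "HH:MM:SS" form, and all values are
made up.

```r
library(xts)

# Hypothetical rows shaped like the table described above. The date and
# time values are assumed to arrive in R as character strings.
raw <- data.frame(
  date   = c("2011-05-13", "2011-05-13", "2011-05-13"),
  time   = c("09:30:00", "09:31:00", "09:32:00"),
  open   = c(57.10, 57.12, 57.15),
  high   = c(57.14, 57.16, 57.20),
  low    = c(57.08, 57.11, 57.13),
  close  = c(57.12, 57.15, 57.18),
  volume = c(120000, 98000, 101500),
  stringsAsFactors = FALSE
)

# Combine the two columns into one timestamp. The time zone is an
# assumption; substitute the exchange's zone if the data is in local time.
stamp <- as.POSIXct(paste(raw$date, raw$time),
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Build the xts object from the numeric columns, indexed by the timestamp.
ohlcv <- c("open", "high", "low", "close", "volume")
x <- xts(as.matrix(raw[, ohlcv]), order.by = stamp)
```

Doing the conversion once and then saving x to an rda file means the
parsing cost is not paid again on every load.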
>>
>> As a note - I am the only one who accesses the data.
>>
>> I would greatly appreciate it if you could share your experiences and
>> possibly your data/workspace set up.
>>
>>
>>


