[R-SIG-Finance] Process used to manage workspace and large data files

algotr8der algotr8der at gmail.com
Sat May 14 03:48:20 CEST 2011

I think I have been managing my workspace and large data files
inefficiently. With daily stock price data the inefficiencies were
manageable, but I have now moved to intraday data and need to optimize. To
that end, I installed a MySQL database on the same machine where I run R to
store the intraday data (minute frequency).

However, I find that loading minute stock data using dbGetQuery is very
slow. I have read several comments here on using RData files or
getSymbols.MySQL but would greatly appreciate further insight.

The structure of the MySQL table that stores the minute data is as follows:

date, time, open, high, low, close, volume

Every time I load data into R I have to manipulate the date and time
columns to get the data into the right format before I can generate an xts
object from it. I'm wondering whether I need to combine the date and time
columns in MySQL into one date column with the following format so that I
can use getSymbols.MySQL:

%m/%d/%Y %h:%i:%s %p

It seems to me that getSymbols.MySQL would need the date to be in the
aforementioned format; otherwise, how would it produce an xts object when
the time is stored in a separate column?
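
For reference, the R-side manipulation described above can be sketched
roughly as follows. This is a minimal illustration, not a tested setup: the
sample rows are made up, the "YYYY-MM-DD" and "HH:MM:SS" text formats for
the date and time columns are assumptions about how the table stores them,
and the UTC time zone is a placeholder. In practice the data frame would
come from dbGetQuery rather than being typed in.

```r
library(xts)

# Hypothetical rows shaped like the table above (date, time, open, high,
# low, close, volume). The text formats of date and time are assumptions.
bars <- data.frame(
  date   = c("2011-05-13", "2011-05-13", "2011-05-13"),
  time   = c("09:30:00", "09:31:00", "09:32:00"),
  open   = c(10.00, 10.05, 10.02),
  high   = c(10.06, 10.07, 10.10),
  low    = c(9.98, 10.01, 10.00),
  close  = c(10.05, 10.02, 10.08),
  volume = c(1200, 800, 1500),
  stringsAsFactors = FALSE
)

# Paste the two columns together and parse them as a single timestamp.
# tz = "UTC" is a placeholder; the exchange's time zone would go here.
stamp <- as.POSIXct(paste(bars$date, bars$time),
                    format = "%Y-%m-%d %H:%M:%S",
                    tz = "UTC")

# Build the xts object from the numeric columns, indexed by the timestamp.
price_cols <- c("open", "high", "low", "close", "volume")
minute_xts <- xts(as.matrix(bars[, price_cols]), order.by = stamp)
```

If the table instead held one combined timestamp column, the paste step
would drop out and only the parsing call would remain. Note that the format
string quoted above uses MySQL's DATE_FORMAT specifiers; the same layout in
R's strptime notation would be "%m/%d/%Y %I:%M:%S %p".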

As a note - I am the only one who accesses the data.

I would greatly appreciate it if you could share your experiences and
possibly your data/workspace setup.
