[R-SIG-Finance] Dealing with large dataset in quantmod
Brian G. Peterson
brian at braverock.com
Thu Jan 12 20:14:00 CET 2012
On Thu, 2012-01-12 at 19:46 +0100, Gabriele Vivinetto [Public address]
wrote:
> Hello to the mailing list.
> I'm a newbie in R, and this is my first post.
> I've evaluated quantmod using EOD data from Yahoo, and everything went fine.
> I have a MySQL database containing tick-by-tick data (in a format
> suitable for quantmod's getSymbols.MySQL) and I have tried to use these data.
> Using a table with a small subset of the data (1,000 rows) there is no
> problem.
> But if I try to use a table with all the records (~6 million rows), R is
> very slow and memory hungry (simply put, it crashes every time
> after loading the data...).
> As a workaround I've modified the getSymbols.MySQL R function to accept
> from= and to= parameters, so the SQL SELECT gives R only the desired subset
> of data, but working with more than 100k records is still a pain.
> Does anyone have a workaround or suggestions for using large datasets with
> quantmod?
>
> Thank you !
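A minimal sketch of the date-bounded SELECT described above, going through
DBI/RMySQL directly; the table and column names (tick_data, tick_time,
price, volume), connection details, and date range are placeholders for
your own schema:

library(DBI)
library(RMySQL)
library(xts)

con <- dbConnect(MySQL(), user = "user", password = "pass",
                 dbname = "ticks", host = "localhost")

from <- "2011-12-01 00:00:00"
to   <- "2011-12-31 23:59:59"

sql <- sprintf(
  "SELECT tick_time, price, volume FROM tick_data
   WHERE tick_time >= '%s' AND tick_time <= '%s'
   ORDER BY tick_time", from, to)

raw <- dbGetQuery(con, sql)
dbDisconnect(con)

## index the result by the tick timestamps
ticks <- xts(raw[, c("price", "volume")],
             order.by = as.POSIXct(raw$tick_time, tz = "GMT"))

Pulling only the window you need this way keeps the result small enough to
convert to xts in one step.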
I routinely use xts on tick data series with tens or hundreds of
millions of rows. I also have a lot of RAM (16-48GB) per machine.
Some of the things that will affect how much RAM the xts object uses are
the number of columns in your data and whether you are using a numeric or
character xts object.
We just ran a little test here: 17.5M rows in one column take about a
third of a GB of RAM.
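A back-of-the-envelope check along those lines (the exact figure will vary
with your platform and xts version):

library(xts)
n   <- 17.5e6
idx <- seq(as.POSIXct("2012-01-01", tz = "GMT"), by = 1, length.out = n)
x   <- xts(rnorm(n), order.by = idx)
print(object.size(x), units = "MB")
## roughly 8 bytes per data value plus 8 bytes per index entry,
## i.e. about 17.5e6 * 16 bytes, or ~280 MB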
In a 32-bit environment, R is limited to 3 GB of RAM, so this may be part
of your problem.
Lastly, you don't say which functions you are calling that respond slowly.
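If it helps to narrow that down, base R's timing and profiling tools are
usually enough; a minimal sketch, assuming a tick-level xts object named
ticks as in the earlier snippet:

library(xts)
## time a single suspect call, e.g. resampling the ticks to 1-minute bars
system.time(bars <- to.minutes(ticks[, "price"]))

## or profile a longer sequence of calls
Rprof("ticks.prof")
bars <- to.minutes(ticks[, "price"])
Rprof(NULL)
summaryRprof("ticks.prof")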
Regards,
- Brian
--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock