[R-SIG-Finance] Dealing with large dataset in quantmod
Paul Gilbert
pgilbert902 at gmail.com
Thu Jan 12 20:34:14 CET 2012
On 12-01-12 01:46 PM, Gabriele Vivinetto [Public address] wrote:
> Hello to the mailing list.
> I'm a newbie in R, and this is my first post.
> I've evaluated quantmod using EOD data from Yahoo, and everything went fine.
> I have a MySQL database containing tick-by-tick data (in a format
> suitable for quantmod's getSymbols.MySQL) and I have tried to use this data.
> Using a table with a small subset of the data (1,000 rows) there is no
> problem.
> But if I try to use a table with all the records (~6 million rows), R is
> very slow and memory hungry (simply put, it crashes every time
> after loading the data...).
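For reference, a minimal sketch of the kind of call being described,
assuming a tick table named after the symbol with quantmod's default
columns (date, o, h, l, c, v, a); the credentials and symbol below are
placeholders:

    library(quantmod)
    # Point getSymbols.MySQL at the tick database (placeholder credentials)
    setDefaults(getSymbols.MySQL,
                user = "ruser", password = "rpass",
                dbname = "ticks", host = "localhost", port = 3306)
    # This loads the entire EURUSD table into memory in one go
    getSymbols("EURUSD", src = "MySQL")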
You first need to determine whether the problem is the MySQL connection
or your memory, OS, and R version. If you are not running a 64-bit
version of R you may simply be hitting the limits of a 32-bit machine.
If you are not running on a machine with a lot of physical memory then
you will be using swap, which will be slow. (These limits might also
bite on the server side.) You should probably monitor with something
like top to get a better idea of what is going wrong. If it really is
the MySQL connection that is the problem then you may need to look at
the size of the chunks returned by the request.
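Something along these lines (untested; the credentials, table name, and
100k-row chunk size are only illustrative) shows what I mean by checking
the build and fetching in chunks:

    library(DBI)
    library(RMySQL)

    # 64 on a 64-bit build of R, 32 on a 32-bit one
    .Machine$sizeof.pointer * 8

    con <- dbConnect(MySQL(), user = "ruser", password = "rpass",
                     dbname = "ticks", host = "localhost")
    res <- dbSendQuery(con, "SELECT * FROM EURUSD ORDER BY date")
    while (!dbHasCompleted(res)) {
      chunk <- fetch(res, n = 100000)   # pull 100k rows per round trip
      # ... process or aggregate the chunk here, then discard it ...
    }
    dbClearResult(res)
    dbDisconnect(con)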
Regards,
Paul
> As a workaround I've modified the getSymbols.MySQL R function to accept
> from= and to= parameters, so the SQL SELECT gives R only the desired
> subset of data, but using more than 100k records is a pain.
> Does anyone have a workaround or suggestions for using large datasets
> with quantmod?
>
> Thank you!
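A rough sketch of the date-bounded SELECT that workaround describes
might look like the following; the get_ticks helper, column names, and
date handling are illustrative assumptions, not the actual patch:

    library(DBI)
    library(RMySQL)
    library(xts)

    # Fetch only the rows between 'from' and 'to' (hypothetical helper)
    get_ticks <- function(con, symbol, from, to) {
      query <- sprintf(
        "SELECT date, o, h, l, c, v, a FROM %s WHERE date >= '%s' AND date <= '%s'",
        symbol, from, to)
      dat <- dbGetQuery(con, query)
      # Return an xts object, as quantmod's getSymbols does by default
      xts(dat[, -1], order.by = as.POSIXct(dat$date))
    }

    con <- dbConnect(MySQL(), user = "ruser", password = "rpass",
                     dbname = "ticks")
    eur <- get_ticks(con, "EURUSD", from = "2012-01-02", to = "2012-01-03")
    dbDisconnect(con)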