[R-SIG-Finance] R Memory Usage

Jeffrey Ryan jeffrey.ryan at lemnica.com
Sun Apr 10 21:14:42 CEST 2011


Elliot,

One of the advantages of posting to the finance list is that those of
us who work with large data in finance can comment on the tools you
use as well.

One thing you didn't mention was which packages you are using, or
examples of the specific code you are calling.

For financial time series, one of the most heavily optimized tools is
xts - built precisely with memory management and large data in mind.
Using something ad hoc - strings in a data.frame, for example - can
cause tremendous overhead.
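
As a minimal sketch (the dimensions and values below are invented for
illustration), holding the returns as a numeric matrix inside an xts
object keeps one contiguous block of doubles plus a small time index:

    library(xts)

    ## hypothetical panel: ~3000 trading days x 800 stocks of fake returns
    dates   <- seq(as.Date("2000-01-03"), by = "day", length.out = 3000)
    returns <- xts(matrix(rnorm(3000 * 800, sd = 0.01), ncol = 800),
                   order.by = dates)

    ## ISO-8601 style subsetting pulls out a date range cheaply
    r2008 <- returns["2008"]
    print(object.size(returns), units = "Mb")

Compare that with 800 character columns in a data.frame, where every
value carries per-string overhead on top of the data itself.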

Another question is whether you need the full data set resident in
memory at all times.  R's rds format, a database, or out-of-core
objects such as those provided by the mmap and indexing packages can
greatly improve things.
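
For example, a hedged sketch (file names are hypothetical, and it
continues with the returns object from the sketch above): saveRDS()
lets you push intermediate objects to disk and reload them only when
needed, while the mmap package maps a flat binary file into the
address space so the OS pages data in on demand:

    library(xts)    # for coredata()

    ## persist, drop, and reload an intermediate result
    saveRDS(returns, "returns.rds")
    rm(returns); gc()
    returns <- readRDS("returns.rds")

    ## out-of-core access: write the raw doubles to a flat file,
    ## then map the file instead of loading it
    library(mmap)
    writeBin(as.vector(coredata(returns)), "returns.bin")
    m <- mmap("returns.bin", mode = real64())  # real64() is mmap's C double type
    m[1:10]                                    # faults in only the pages touched
    munmap(m)

Only the pages you actually index get pulled into RAM, so the R
process stays small even when the file is large.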

If you are able to come to the R/Finance conference in Chicago on the
29th and 30th of this month, you'll have a chance to talk to some of
those 'in the trenches' with respect to using R on big data.  And as
you point out (as does Brian) - 800x3000 isn't very large, so your
case isn't unique.

Would be great to see you later this month in Chicago!  www.RinFinance.com

Best,
Jeff



On Sun, Apr 10, 2011 at 10:49 AM, Elliot Joel Bernstein
<elliot.bernstein at fdopartners.com> wrote:
> This is not specifically a finance question, but I'm working with financial
> data (daily stock returns), and I suspect many people using R for financial
> analysis face similar issues. The basic problem I'm having is that with a
> moderately large data set (800 stocks x 11 years), performing a few
> operations such as data transformations, fitting regressions, etc., results
> in R using an enormous amount of memory -- sometimes upwards of 5GB -- even
> after using gc() to try to free up some memory. I've read several posts to
> various R mailing lists over the years indicating that R does not release
> memory back to the system on certain OSs (64 bit Linux in my case), so I
> understand that this is "normal" behavior for R. How do people typically
> work around this to do exploratory analysis on large data sets without
> having to constantly restart R to free up memory?
>
> Thanks.
>
> - Elliot Joel Bernstein



-- 
Jeffrey Ryan
jeffrey.ryan at lemnica.com

www.lemnica.com

R/Finance 2011 April 29th and 30th in Chicago | www.RinFinance.com


