[R-SIG-Finance] Discretising intra-day data -- how to get by with less memory?

Ajay Shah ajayshah at mayin.org
Fri Nov 27 18:27:28 CET 2009


On Fri, Nov 27, 2009 at 07:37:03AM -0500, Gabor Grothendieck wrote:
> What you are asking for has the potential to create huge data sets
> depending on the time range of the data and N. What are they?   If
> that is the problem then it's not just a matter of how memory intensive
> the code is but just about any manipulation will fail.  Do the
> problems arise on the aggregate or moving back and forth between zoo
> and ts?

The object I'm dealing with has 13,667,891 rows and a lot of columns.

I thought it might make sense to:

  thinz <- z[, 1]
  # work out, from thinz alone, the row numbers that aggregate(blah, tail, 1) would keep
  discretised <- z[therownums, ]

So instead of doing an aggregate(blah,tail,1), we'd analyse thinz and
come up with an integer vector therownums, and use that to make the
discretised object.

This would be memory efficient since thinz has only one column.
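
A rough sketch of the row-number idea, in base R so it runs without the real data. The helper name last_row_per_bucket, the bucket width N = 10 seconds and the toy matrix are all made up for illustration, and the timestamps are assumed to be sorted, as a zoo index would be:

```r
# Hypothetical helper: row numbers of the last observation in each
# N-second bucket. Assumes `times` is sorted in increasing order.
last_row_per_bucket <- function(times, N) {
  if (length(times) == 0) return(integer(0))
  bucket <- floor(as.numeric(times) / N)   # bucket id for every row
  # A row is the last of its bucket when the next row falls in a
  # different bucket; the final row always closes its bucket.
  which(c(diff(bucket) != 0, TRUE))
}

# Toy data standing in for the real object (made-up values).
times <- as.POSIXct("2009-11-27 09:00:00", tz = "UTC") +
  c(1, 4, 9, 12, 18, 31, 33)
z <- matrix(seq_len(7 * 3), nrow = 7,
            dimnames = list(NULL, c("a", "b", "c")))

therownums  <- last_row_per_bucket(times, N = 10)
discretised <- z[therownums, , drop = FALSE]
```

With a zoo object, the timestamps would presumably come from index(z) and the last step would be z[therownums, ], so the wide object is only touched once, in the final subsetting step. Only the time index and one numeric vector of the same length are needed to find the rows.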

-- 
Ajay Shah                                      http://www.mayin.org/ajayshah  
ajayshah at mayin.org                             http://ajayshahblog.blogspot.com
<*(:-? - wizard who doesn't know the answer.


