[R-SIG-Finance] Discretising intra-day data -- how to get by with less memory?
Gabor Grothendieck
ggrothendieck at gmail.com
Fri Nov 27 20:18:51 CET 2009
One other thing to consider is whether you could use zooreg instead of
zoo; in that case you might not need to fill in the gaps:
> z <- zoo(c(11, 13, 14), c(1, 3, 4))
> zz <- as.zooreg(z)
> zz
 1  3  4
11 13 14
> lag(zz)
 0  2  3
11 13 14
Note how time 3 was lagged to become time 2 even though the original
series had no time 2.
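
For contrast, here is a small sketch that puts the plain zoo lag next to
the zooreg lag on the same three points. It only reuses the objects from
the session above; the names lag_zoo and lag_zooreg are made up for the
illustration. As I understand it, lag on a plain zoo object shifts along
the observed index positions, whereas lag on a zooreg object shifts the
time stamps themselves by one regular step.

```r
library(zoo)

z  <- zoo(c(11, 13, 14), c(1, 3, 4))
zz <- as.zooreg(z)

# Plain zoo: each value moves to the time stamp of the previous
# observation, so only existing time stamps are reused.
lag_zoo <- lag(z)

# zooreg: every time stamp is shifted back by one regular step,
# so times that were never observed (0 and 2) can appear.
lag_zooreg <- lag(zz)

print(lag_zoo)
print(lag_zooreg)
```

So with zooreg the gaps stay implicit: nothing is stored for time 2, yet
lagging still behaves as if the series lived on a regular grid.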
On Fri, Nov 27, 2009 at 12:27 PM, Ajay Shah <ajayshah at mayin.org> wrote:
> On Fri, Nov 27, 2009 at 07:37:03AM -0500, Gabor Grothendieck wrote:
>> What you are asking for has the potential to create huge data sets
>> depending on the time range of the data and N. What are they? If
>> that is the problem then it's not just a matter of how memory intensive
>> the code is but just about any manipulation will fail. Do the
>> problems arise on the aggregate or moving back and forth between zoo
>> and ts?
>
> The object I'm dealing with has 13,667,891 rows and a lot of columns.
>
> I thought it might make sense to:
>
> thinz <- z[,1]
> figure out the row numbers for the aggregate(blah, tail, 1) operation in thinz
> discretised <- z[therownums,]
>
> So instead of doing an aggregate(blah,tail,1), we'd analyse thinz and
> come up with an integer vector therownums, and use that to make the
> discretised object.
>
> This would be memory efficient since thinz has only one column.
>
> --
> Ajay Shah http://www.mayin.org/ajayshah
> ajayshah at mayin.org http://ajayshahblog.blogspot.com
> <*(:-? - wizard who doesn't know the answer.
>
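
The row-number idea quoted above could be sketched roughly as follows.
This is only an illustration under assumptions that are not in the
thread: the toy data, the POSIXct index, the bucket width N = 300
seconds and the helper name bucket are all invented here. Note that the
row numbers can be computed from the time index alone, so even the
one-column thinz may not be needed.

```r
library(zoo)

# Toy stand-in for the large multi-column object (hypothetical data).
set.seed(1)
tt <- as.POSIXct("2009-11-27 09:00:00", tz = "UTC") + sort(sample(0:3599, 50))
z  <- zoo(matrix(rnorm(100), ncol = 2), tt)

N <- 300  # assumed bucket width in seconds

# Bucket id for every row, derived from the index only.
bucket <- as.numeric(time(z)) %/% N

# The index is sorted, so the last row of each bucket is the last
# occurrence of each bucket id.
therownums <- which(!duplicated(bucket, fromLast = TRUE))

# Subset all columns once, using the integer row numbers.
discretised <- z[therownums, ]
```

The rows of discretised keep their original time stamps (the last
observation in each bucket) rather than the bucket boundaries; if a
regular grid is wanted, the index would still have to be replaced
afterwards.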