[R] importing and merging many time series
Joshua Ulrich
josh.m.ulrich at gmail.com
Mon Apr 15 14:34:07 CEST 2013
On Sun, Apr 7, 2013 at 7:40 AM, Anton Lebedevich <mabrek at gmail.com> wrote:
> Hello.
>
> I've got many (5-20k) files with time series in a text format like this:
>
> 1359635460 2.006747
> 1359635520 1.886745
> 1359635580 3.066988
> 1359635640 3.633578
> 1359635700 2.140082
> 1359635760 2.033564
> 1359635820 1.980123
> 1359635880 2.060131
> 1359635940 2.113416
> 1359636000 2.440172
>
> The first field is a Unix timestamp, the second is a float. It's a text
> export of http://graphite.readthedocs.org/en/latest/whisper.html
> databases. Time series could have different resolutions, start/end
> times, and possibly gaps inside.
>
> Current way of importing them:
>
> read.file <- function(file.name) {
>   read.zoo(
>     file.name,
>     na.strings="None",
>     colClasses=c("integer", "numeric"),
>     col.names=c("time", basename(file.name)),
>     FUN=function(t) {as.POSIXct(t, origin="1970-01-01", tz="UTC")},
>     drop=FALSE)
> }
>
> load.metrics <- function(path=".") {
>   do.call(merge.zoo, lapply(list.files(path, full.names=TRUE), read.file))
> }
>
> It works for 6k time series with 2k points in each, but fails with an
> out-of-memory error on a 16 GB box when I try to import 10k time series
> with 10k points each.
>
You're trying to merge 10,000 objects in a single call. I'm not
surprised you run out of RAM.
> I've tried to make merging incremental by using Reduce but import speed
> became unacceptable:
>
This is similar to growing an object in a for loop, which is also slow.
> load.metrics <- function(path=".") {
>   Reduce(
>     function(a, b) {
>       # On the first call `a` is still a file name, not a zoo object.
>       if (is.character(a)) {
>         a <- read.file(a)
>       }
>       merge.zoo(a, read.file(b))
>     },
>     list.files(path, full.names=TRUE))
> }
>
> Is there a faster and less memory-consuming way to import and merge a
> lot of time series?
>
Try something in between the two extremes (merging all objects at
once, versus merging every new object with the accumulated object).
For example, try merging 100-1000 objects at a time.
You might also benefit from converting your objects to xts, so you can
use xts' optimized merge. You can always convert the final object
back to zoo.
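
A minimal sketch of that in-between approach, under some assumptions: the
helper names read.file.xts and batch.merge and the default batch size of 500
are illustrative only (they are not from your code), and the files are
assumed to have the two-column layout you showed. It reads each file into a
one-column xts object, merges the objects in batches, repeats on the batch
results until one object remains, and converts back to zoo at the end.

```r
library(xts)  # attaching xts also attaches zoo

# Hypothetical helper: read one "timestamp value" file into a one-column xts.
read.file.xts <- function(file.name) {
  d <- read.table(file.name,
                  na.strings = "None",
                  colClasses = c("integer", "numeric"),
                  col.names = c("time", "value"))
  x <- xts(d$value,
           order.by = as.POSIXct(d$time, origin = "1970-01-01", tz = "UTC"))
  colnames(x) <- basename(file.name)
  x
}

# Hypothetical helper: merge a list of xts objects batch.size at a time,
# then merge the per-batch results, until a single object is left.
batch.merge <- function(objs, batch.size = 500) {
  stopifnot(length(objs) > 0, batch.size >= 2)
  while (length(objs) > 1) {
    groups <- split(objs, ceiling(seq_along(objs) / batch.size))
    objs <- unname(lapply(groups, function(g) {
      g <- unname(g)
      # A batch with a single member needs no merge.
      if (length(g) == 1) g[[1]] else do.call(merge, g)
    }))
  }
  objs[[1]]
}

load.metrics <- function(path = ".", batch.size = 500) {
  files <- list.files(path, full.names = TRUE)
  merged <- batch.merge(lapply(files, read.file.xts), batch.size)
  as.zoo(merged)  # drop this conversion if xts is fine downstream
}
```

The batch size is a tuning knob, not a recommendation: smaller batches lower
the peak memory of each merge call but add more merge rounds, so it is worth
timing a few values on a subset of the files.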
> Regards,
> Anton Lebedevich.
>
Best,
--
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com
R/Finance 2013: Applied Finance with R | www.RinFinance.com