[R] How to create a data.frame from several time series?

Robert Latest boblatest at gmail.com
Mon Apr 16 13:04:17 CEST 2012


Hello all,

Please look at my code below. The problems start where it says
"# PROBLEMS START HERE". Some sample data is at the very bottom.

This is the diagnostic output from the script:

> source('load.R')
  ts.null
1      NA
2      NA
3      NA
4      NA
5      NA
6      NA
[1] "Adding data" "VS1A"
  ts.null VS1A.ts.null VS1A.tts
1      NA           NA       NA
2      NA           NA       NA
3      NA           NA 1.585324
4      NA           NA 1.326600
5      NA           NA 1.914382
6      NA           NA 1.333249
[1] "Adding data" "VS1B"
Error in get(as.character(FUN), mode = "function", envir = envir) :
  object 'FUN' of mode 'function' was not found
>

I have several issues with that.
1) Why doesn't the data frame df.all have timestamps in its first column?
2) Why aren't the additional columns named "VS1A", "VS1B", but
"VS1A.ts.null", "VS1A.tts"?
3) What does the error message at the end mean, and why doesn't it
occur on the first loop iteration?
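
Regarding 3), one plausible cause is visible in the script itself: inside
the loop the result of aggregate() is assigned to the name ppk, which is
also the name of the function passed to aggregate(). After the first
iteration ppk would then be a data frame rather than a function. The sketch
below reproduces that pattern with base R only; score, d, first and second
are hypothetical names used for illustration and are not part of the script.

```r
# Hypothetical stand-in for the ppk() function in the script.
score <- function(x) mean(x)

# Small made-up data set with one grouping column and one value column.
d <- data.frame(g = c("a", "a", "b"), v = c(1, 2, 3))

# First call works: the name 'score' still refers to a function.
first <- aggregate(v ~ g, data = d, FUN = score)

# Reusing the name for the result replaces the function with a data frame ...
score <- first

# ... so the same call now fails, because FUN is no longer a function.
# tryCatch() captures the error message instead of stopping the script.
second <- tryCatch(
    aggregate(v ~ g, data = d, FUN = score),
    error = function(e) conditionMessage(e)
)
print(second)
```

If this is indeed what happens, giving the aggregated result a different
name than the function should let the loop reach its second iteration.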

It seems like I could also first create all the time series and then
use ts.union to combine them into a data frame, but I don't know how
to do that because I don't know beforehand how many series I create in
the for() loop, how to distinguish them by (unknown beforehand) tool
names, and how to supply them to ts.union.
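
A minimal sketch of that idea, assuming the per-tool series are collected
in a named list and handed to ts.union() through do.call(). The tool names,
start months and values below are made up for illustration; in the script
they would come from levels(df$tool) and from aggregate().

```r
# Hypothetical per-tool data; the number of tools need not be known in advance.
tools  <- c("VS1A", "VS1B")
starts <- list(VS1A = c(2010, 3), VS1B = c(2010, 1))
values <- list(VS1A = c(1.2, 1.4, 1.1), VS1B = c(0.9, 1.0))

# Collect one monthly series per tool in a list, keyed by tool name.
series <- list()
for (ti in tools) {
    series[[ti]] <- ts(values[[ti]], start = starts[[ti]], frequency = 12)
}

# do.call() passes every list element to ts.union() as a separate named
# argument, so the list names become the column names of the result.
all.ts <- do.call(ts.union, series)

# ts.union() pads months that a series does not cover with NA.
# time() recovers the time axis, which a plain data.frame(ts) would drop.
df.all <- data.frame(time = as.numeric(time(all.ts)), as.data.frame(all.ts))
print(df.all)
```

Because ts.union() already spans the full range of all series, a separate
all-NA placeholder series should not be needed with this approach.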

Thanks,
robert


############################ CODE HERE

library(zoo)

ppk <- function(data, lsl, usl) {
    if (length(data) < 15) {
        return(NA)
    } else {
        return (min(mean(data)-lsl,
        usl-mean(data))/(3*sd(data)))
    }
}

load <- function(filename) {
    d <- read.table(filename,
        header=TRUE,
        sep='\t')
    # filter data
    d <- d[d$value >= 1300 & d$value <= 1500,]
    # add column for later aggregation
    d$month = as.yearmon(d$timestamp)
    return(d)
}


df <- load('data.tsv')

# create an "all-encompassing" time series to unionize the actual data with
ts.null = ts(data=NA, start=min(df$month), end=max(df$month),
frequency=12)
print(ts.null)

#
# PROBLEMS START HERE
#

df.all <- data.frame(ts.null)
# I was hoping to have a data frame with monthly time stamps in the first
# column. Not so.

for (ti in levels(df$tool)) {
    print(head(df.all))
    print(c("Adding data", ti))
    ppk <- aggregate(
            data=df[df$tool==ti,],
            value~month, ppk, lsl=1300, usl=1500)
    tts <- as.ts(zooreg(ppk$value, order.by=ppk$month, frequency=12))
# I'm hoping that zooreg() fills in empty months with NAs, but I have no
# idea how to deal with leading or trailing empty months

    df.all[ti] <- ts.union(ts.null, tts)
# This totally doesn't work as expected, and it messes up something so bad
# that the script crashes on the second iteration.

}



################################ some DF data here

             timestamp tool value    month
1  2010-01-26 08:41:04 VS1A  1400 Jan 2010
2  2010-01-26 08:44:04 VS4A  1420 Jan 2010
3  2010-01-26 10:15:45 VS4B  1400 Jan 2010
4  2010-01-26 11:37:53 VS1B  1360 Jan 2010
5  2010-01-26 12:53:53 VS1B  1380 Jan 2010
6  2010-01-26 14:48:06 VS2B  1410 Jan 2010
7  2010-01-26 14:48:29 VS2A  1410 Jan 2010
8  2010-01-26 23:21:48 VS3A  1400 Jan 2010
9  2010-01-27 07:48:15 VS1A  1420 Jan 2010
10 2010-01-27 07:48:26 VS1B  1400 Jan 2010
11 2010-01-27 07:49:51 VS2A  1410 Jan 2010
12 2010-01-27 07:50:08 VS2B  1390 Jan 2010
13 2010-01-27 12:30:02 VS3A  1400 Jan 2010
14 2010-01-27 12:30:19 VS3B  1420 Jan 2010
15 2010-01-27 12:30:36 VS4B  1420 Jan 2010
16 2010-02-08 11:47:54 VS1A  1370 Feb 2010
17 2010-02-08 11:48:06 VS1B  1370 Feb 2010
18 2010-02-08 11:49:42 VS3A  1430 Feb 2010
19 2010-02-08 11:50:09 VS3B  1350 Feb 2010
20 2010-02-08 11:51:06 VS2A  1400 Feb 2010
>


