[R] merge.zoo produces non unique chron time series
Gabor Grothendieck
ggrothendieck at gmail.com
Thu Mar 18 20:05:19 CET 2010
It occurs because the two series have times that are slightly
different. For example, it may appear that the times at positions 909
and 910 are the same but in fact they are different:
> tt <- time(zoo.ts.cor)[909:910]; tt
[1] (03/09/09 12:30:00) (03/09/09 12:30:00)
> diff(as.numeric(tt))
[1] 1.818989e-12
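One likely source of such a difference is that the two indexes were generated by seq() from different starting times, so the same clock time is reached through different floating-point sums. The sketch below reproduces the effect with plain numbers in base R rather than chron objects; the grids a and b are made up for illustration.

```r
# Two grids with the same step but different starting points.
a <- seq(0, 1, by = 0.1)      # 0.0, 0.1, ..., 1.0
b <- seq(0.3, 1, by = 0.1)    # 0.3, 0.4, ..., 1.0

# a[4] is computed as 0 + 3 * 0.1, while b[1] is the literal 0.3.
# Both print as 0.3, but they are not the same double.
print(c(a[4], b[1]))
exactly_equal <- a[4] == b[1]                    # FALSE
nearly_equal  <- isTRUE(all.equal(a[4], b[1]))   # TRUE

# Exact matching therefore fails to find b[1] in a.
pos <- match(b[1], a)                            # NA
```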
Here are three ways to deal with this:
############
# 1. Use a different date/time class.
############
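As a sketch of the first option, POSIXct stores times as seconds rather than fractional days, so a five-minute grid consists of whole numbers that doubles represent exactly. The example uses only base R and the dates from the original post; the GMT time zone and the variable names are assumptions made for illustration.

```r
# Reference grid without gaps, every five minutes.
t_st <- as.POSIXct("2009-08-31 09:20:00", tz = "GMT")
t_nd <- as.POSIXct("2009-09-03 23:55:00", tz = "GMT")
full_index <- seq(t_st, t_nd, by = "5 min")

# Second block of the gappy series, generated from its own starting time.
block_b <- seq(as.POSIXct("2009-09-03 07:25:00", tz = "GMT"), t_nd,
               by = "5 min")

# Every time in block_b is found exactly in the reference grid ...
pos <- match(as.numeric(block_b), as.numeric(full_index))
all_found <- !anyNA(pos)

# ... so combining the two indexes adds no new time stamps.
n_union <- length(unique(as.numeric(c(full_index, block_b))))
```

An index built this way can be passed to zoo() in place of the chron index.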
############
# 2. Round off inputs to merge:
############
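# round a time in days to the nearest whole second
rnd <- function(x) round(24 * 3600 * x) / (24 * 3600)
# return x with its chron index rounded to the nearest second
Rnd <- function(x) {
  time(x) <- chron(rnd(as.numeric(time(x))))
  x
}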
zoo.ts.cor.rnd <- merge.zoo(Rnd(zoo.ts.pot), Rnd(zoo.ts))
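The effect of rounding can be checked on plain numbers without chron or zoo. This is a minimal sketch: round_to_second is a hypothetical helper that rounds a time in days to the nearest whole second, and t1 and t2 are made-up values one floating-point step apart, at the magnitude of the dates in question.

```r
secs_per_day <- 24 * 3600

# Hypothetical helper: round a time given in days to the nearest second.
round_to_second <- function(x) round(secs_per_day * x) / secs_per_day

# 12:30 on day 14490 (days since 1970-01-01) and its next representable
# neighbour; 2^-39 is about 1.8e-12 days at this magnitude.
t1 <- 14490 + 12.5 / 24
t2 <- t1 + 2^-39

n_before <- length(unique(c(t1, t2)))                    # 2 distinct values
n_after  <- length(unique(round_to_second(c(t1, t2))))   # 1 value
```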
############
# 3. zoo provides a facility for the user to change how matching is done.
# Define a MATCH method for the class of interest. In this case
# it's done like this:
############
# rnd function is from above
MATCH.chron <- function(x, table, nomatch = NA, ...) {
  match(rnd(as.numeric(x)), rnd(as.numeric(table)), nomatch = nomatch, ...)
}
zoo.ts.cor.ok <- merge.zoo(zoo.ts.pot, zoo.ts)
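The matching rule inside MATCH.chron can also be tried in isolation. The sketch below uses base R only; match_by_second is a hypothetical stand-in that compares times as whole seconds, and the grid of three five-minute steps is made up.

```r
secs_per_day <- 24 * 3600

# Hypothetical stand-in for the body of a MATCH method: compare times
# (in days) after converting them to whole seconds.
match_by_second <- function(x, table, nomatch = NA_integer_) {
  match(round(secs_per_day * x), round(secs_per_day * table),
        nomatch = nomatch)
}

step <- 1 / (24 * 12)                 # five minutes, in days
t0   <- 14490 + 12.5 / 24             # 12:30 on day 14490
grid <- c(t0 - step, t0, t0 + step)
x    <- t0 + 2^-39                    # differs from t0 in the last bit

exact_pos <- match(x, grid)              # NA: no exactly equal value
fuzzy_pos <- match_by_second(x, grid)    # 2: same second as t0
```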
On Thu, Mar 18, 2010 at 2:00 PM, Jan Schwanbeck <jschwanb at gmail.com> wrote:
> Dear all,
>
> merge.zoo produces duplicated time indexes in the example below. Can
> anybody explain what I am doing wrong?
>
> Many thanks!
>
> Jan
>
> (R 2.10.0, WindowsXP)
>
> require(chron)
> require(zoo)
>
> # create time series (zoo.ts) with no data from 31Aug2009 14:50 to 3Sep2009 7:25
> t.st <- chron("31/08/2009","09:20:00",format=c("d/m/Y","h:m:s"))
> t.nd <- chron("31/08/2009","14:50:00",format=c("d/m/Y","h:m:s"))
> t.indexA<-seq(t.st,t.nd,by=1/(24*12))
>
> t.st <- chron("03/09/2009","07:25:00",format=c("d/m/Y","h:m:s"))
> t.nd <- chron("03/09/2009","23:55:00",format=c("d/m/Y","h:m:s"))
> t.indexB<-seq(t.st,t.nd,by=1/(24*12))
>
> t.index<-c(t.indexA,t.indexB)
>
> zoo.ts <- zoo(0,t.index)
>
> # create reference time series (zoo.ts.pot) without missing time steps
> t.st <- head(time(zoo.ts),1)
> t.nd <- tail(time(zoo.ts),1)
> t.index<-seq(t.st,t.nd,by=1/(24*12))
> zoo.ts.pot <- zoo(NA,t.index)
>
> # merge erroneous time series with reference time series
> zoo.ts.cor <- merge.zoo(zoo.ts.pot,zoo.ts)
>
> ####################################
> #################### Do not run: Results:
> zoo.ts.pot zoo.ts
> (31/08/09 09:20:00) NA 0
> (31/08/09 09:25:00) NA 0
> (31/08/09 09:30:00) NA 0
> ...
> (31/08/09 14:50:00) NA 0
> (31/08/09 14:55:00) NA NA
> (31/08/09 15:00:00) NA NA
> (31/08/09 15:05:00) NA NA
> ...
> (03/09/09 07:15:00) NA NA
> (03/09/09 07:20:00) NA NA
> (03/09/09 07:25:00) NA 0
> (03/09/09 07:30:00) NA 0
> (03/09/09 07:35:00) NA 0
> (03/09/09 07:40:00) NA 0
> (03/09/09 07:45:00) NA 0
> (03/09/09 07:50:00) NA 0
> (03/09/09 07:55:00) NA 0
> (03/09/09 08:00:00) NA 0
> # everything o.k. up to here
> (03/09/09 08:00:00) NA NA
> # from here: every tenth time stamp is duplicated. Why?
> ...
> (03/09/09 08:45:00) NA 0
> (03/09/09 08:45:00) NA NA
> ..
> (03/09/09 09:30:00) NA 0
> (03/09/09 09:30:00) NA NA
> ...
>