[R] merge.zoo returns unmatched dates
arun
smartpink111 at yahoo.com
Mon Oct 1 14:47:11 CEST 2012
Hi,
You can also try this:
Vup<-read.table(text="
Date, Velocity_m/s
2010-01-21 07:42:00, 1.217943
2010-01-21 07:43:00, 1.624395
2010-01-21 07:44:00, 1.526379
2010-01-21 07:45:00, 1.456831
2010-01-21 07:46:00, 1.245390
2010-01-21 07:47:00, 1.374330
",sep=",",header=TRUE,stringsAsFactors=FALSE)
PAS<-read.table(text="
Date, PAS
2010-01-21 05:01:00, 0.0013938
2010-01-21 05:02:00, 0.0015331
2010-01-21 05:03:00, 0.0016725
2010-01-21 05:04:00, 0.0016725
2010-01-21 05:05:00, 0.0012265
2010-01-21 05:06:00, 0.0015889
",sep=",",header=TRUE,stringsAsFactors=FALSE)
library(xts)
PAS$Date<-as.POSIXct(PAS$Date,format="%Y-%m-%d %H:%M:%S",tz="UTC")
Vup$Date<-as.POSIXct(Vup$Date,format="%Y-%m-%d %H:%M:%S",tz="UTC")
Vupxt<-xts(Vup[,2],order.by=Vup[,1],tzone="UTC")
PASxt<-xts(PAS[,2],order.by=PAS[,1],tzone="UTC")
VUPPASxt<- merge(Vupxt,PASxt)
VUPPASzoo<-zoo(VUPPASxt)
VUPPASzoo
#                       Vupxt     PASxt
#2010-01-21 05:01:00       NA 0.0013938
#2010-01-21 05:02:00       NA 0.0015331
#2010-01-21 05:03:00       NA 0.0016725
#2010-01-21 05:04:00       NA 0.0016725
#2010-01-21 05:05:00       NA 0.0012265
#2010-01-21 05:06:00       NA 0.0015889
#2010-01-21 07:42:00 1.217943        NA
#2010-01-21 07:43:00 1.624395        NA
#2010-01-21 07:44:00 1.526379        NA
#2010-01-21 07:45:00 1.456831        NA
#2010-01-21 07:46:00 1.245390        NA
#2010-01-21 07:47:00 1.374330        NA
str(VUPPASzoo)
#‘zoo’ series from 2010-01-21 05:01:00 to 2010-01-21 07:47:00
# Data: num [1:12, 1:2] NA NA NA NA NA ...
#- attr(*, "dimnames")=List of 2
#..$ : chr [1:12] "2010-01-21 05:01:00" "2010-01-21 05:02:00" "2010-01-21 05:03:00" "2010-01-21 05:04:00" ...
#..$ : chr [1:2] "Vupxt" "PASxt"
#Index: POSIXct[1:12], format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" ...
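The 8-hour shift described in the original message below (05:01 becoming 21:01 on the previous day) would be consistent with the same instants being printed in a different time zone, rather than with the data being moved. This is only a guess: the thread never states the poster's time zone, so "America/Los_Angeles" in the base R sketch below is a stand-in for any UTC-8 zone.

```r
# Assumption: the local time zone is 8 hours behind UTC in January.
# "America/Los_Angeles" is a hypothetical choice, not taken from the thread.
fmt <- "%Y-%m-%d %H:%M:%S"
x <- as.POSIXct("2010-01-21 05:01:00", tz = "UTC")

# The same instant rendered in two time zones
utc_view   <- format(x, format = fmt, tz = "UTC")
local_view <- format(x, format = fmt, tz = "America/Los_Angeles")
print(utc_view)    # "2010-01-21 05:01:00"
print(local_view)  # "2010-01-20 21:01:00"

# Changing only the tzone attribute changes the display, not the stored time
y <- x
attr(y, "tzone") <- "America/Los_Angeles"
same_instant <- as.numeric(x) == as.numeric(y)
print(same_instant)  # TRUE
```

If this is what is happening, the values are still attached to the correct instants and only the printed index differs. Running Sys.setenv(TZ = "UTC") before building and merging the series may make the printed index match the original data.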
A.K.
----- Original Message -----
From: Vindoggy ! <vindoggy at hotmail.com>
To: r-help at r-project.org
Cc:
Sent: Monday, October 1, 2012 2:29 AM
Subject: [R] merge.zoo returns unmatched dates
Sorry for the lack of reproducible data, but this seems to be a problem inherent to my dataset and I can't figure out where the issue is.
I have several data frames set up as time series with identical POSIXct date formats. If I keep the original data in data frame format and merge them using base merge, everything is perfect and everyone is happy.
If I transform the data frames to zoo objects and then do a merge.zoo, the data seem to become uncoupled from the original dates. Even more unusual is that some dates in the new merged data set are prior to the start of the original data set. I've attempted below to show what this looks like, and I hope someone has a suggestion as to what may be causing the problem.
Here is one set of data in data.frame format:
head(Vup)
Date Velocity_m/s
1 2010-01-21 07:42:00 1.217943
2 2010-01-21 07:43:00 1.624395
3 2010-01-21 07:44:00 1.526379
4 2010-01-21 07:45:00 1.456831
5 2010-01-21 07:46:00 1.245390
6 2010-01-21 07:47:00 1.374330
str(Vup)
'data.frame': 7168 obs. of 2 variables:
$ Date : POSIXct, format: "2010-01-21 07:42:00" "2010-01-21 07:43:00" ...
$ Velocity_m/s: num 1.22 1.62 1.53 1.46 1.25 ...
And here is a second in data.frame format:
head(PAS)
Date PAS
1 2010-01-21 05:01:00 0.0013938
2 2010-01-21 05:02:00 0.0015331
3 2010-01-21 05:03:00 0.0016725
4 2010-01-21 05:04:00 0.0016725
5 2010-01-21 05:05:00 0.0012265
6 2010-01-21 05:06:00 0.0015889
str(PAS)
'data.frame': 5520 obs. of 2 variables:
$ Date : POSIXct, format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" ...
$ PAS: num 0.00139 0.00153 0.00167 0.00167 0.00123 ...
Using zoo:
PASmin<-zoo(as.matrix(PAS[,2]),as.POSIXct(PAS[,1],format="%Y-%m-%d %H:%M:%S",tz="UTC"))
str(PASmin)
‘zoo’ series from 2010-01-21 05:01:00 to 2010-01-27 13:01:00
Data: num [1:5520, 1] 0.00139 0.00153 0.00167 0.00167 0.00123 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "PAS"
Index: POSIXct[1:5520], format: "2010-01-21 05:01:00" "2010-01-21 05:02:00" "2010-01-21 05:03:00" ...
ADP_UPmin<-zoo(as.matrix(Vup[,2]),as.POSIXct(Vup[,1], format="%Y-%m-%d %H:%M",tz="UTC"))
str(ADP_UPmin)
‘zoo’ series from 2010-01-21 07:42:00 to 2010-01-26 20:12:00
Data: num [1:7168, 1] 1.22 1.62 1.53 1.46 1.25 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr "UP_Velocity_m/s"
Index: POSIXct[1:7168], format: "2010-01-21 07:42:00" "2010-01-21 07:43:00" "2010-01-21 07:44:00" ...
And if I merge the two zoo objects I get this:
M<-merge(ADP_UPmin,PASmin)
head(M)
UP_Velocity_m/s PAS
2010-01-20 21:01:00 NA 0.0013938
2010-01-20 21:02:00 NA 0.0015331
2010-01-20 21:03:00 NA 0.0016725
2010-01-20 21:04:00 NA 0.0016725
2010-01-20 21:05:00 NA 0.0012265
2010-01-20 21:06:00 NA 0.0015889
str(M)
‘zoo’ series from 2010-01-20 21:01:00 to 2010-01-27 05:01:00
Data: num [1:8499, 1:2] NA NA NA NA NA NA NA NA NA NA ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "UP_Velocity_m/s" "PAR"
Index: POSIXct[1:8499], format: "2010-01-20 21:01:00" "2010-01-20 21:02:00" "2010-01-20 21:03:00" ...
For some reason I cannot figure out, even though both the PAS data frame and the PAS zoo object start at 2010-01-21 05:01:00, once merged the PAS data starts on the previous day, at 2010-01-20 21:01:00. The actual numeric data looks good, but both variables have now come uncoupled from the time series dates (the Velocity data is similarly uncoupled). And as stated before, doing a non-zoo merge on the data.frame data works fine.
Anyone got any ideas what's going on?