[R] turning data with start and end date into daily data
Dimitri Liakhovitski
dimitri.liakhovitski at gmail.com
Tue May 3 22:49:46 CEST 2011
Hello!
I have data that contain, among other things, the start date and the
end date of a (daily) time series (see the example below, "mydata"):
mystring1<-c("String 1", "String 2")
mystring2<-c("String a", "String b")
starts<-c(as.Date("2011-02-01"),as.Date("2011-03-02"))
ends<-c(as.Date("2011-03-15"),as.Date("2011-03-31"))
values<-c(2000,10000)
mydata<-data.frame(starts=starts,ends=ends,values=values,mystring1=mystring1,mystring2=mystring2)
(mydata)
I have to reshape it so that each row of "mydata" becomes a daily
time series that starts on the start date and ends on the end date; the
amount in the column "values" has to be distributed equally across
those dates; all other columns keep their original values.
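To make the target concrete, here is a small check of what "distributed equally" means for the first row of the example. It is only a sketch and assumes that both the start date and the end date are counted; the names n.days and per.day are made up for illustration.

```r
# First row of the example: 2011-02-01 to 2011-03-15, value 2000
start <- as.Date("2011-02-01")
end   <- as.Date("2011-03-15")

# Inclusive day count: the difference in days plus one for the end date
n.days <- as.integer(end - start) + 1L

# Every day in the series receives the same share of the original value
per.day <- 2000 / n.days

# The daily series itself, from start to end inclusive
days <- seq(start, end, by = "day")
```

Under that assumption the series has one row per element of days, each carrying per.day, so the daily values add back up to the original 2000.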
My code below does it (see the end result "newdata"). However, to
achieve my goal, I am looping through rows of "mydata" - I am not sure
it will work with my real data set that already has thousands of rows
and also a lot of other columns with strings. I am afraid I'll run out
of memory. Is there maybe a way of doing it more efficiently?
Thanks a lot for your pointers!
newdata<-data.frame(mydate=NA,myvalues=NA,mystring1=NA,mystring2=NA)
for(i in 1:nrow(mydata)){ # i<-2
start.date = mydata[i,"starts"]
end.date = mydata[i,"ends"]
all.dates = seq(start.date, end.date, by = "day") # from start to end date, inclusive
temp.df <- data.frame(mydate = all.dates)
temp.df$myvalues = mydata[i,"values"]/length(all.dates)
temp.df[names(mydata)[4:5]] = mydata[i,4:5]
newdata<-rbind(newdata,temp.df)
}
newdata<-newdata[-1,]
(newdata);(mydata)
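For comparison, one loop-free possibility in base R is to repeat each row index once per day and build the whole result in a single step. This is only a sketch, assuming the end date is included in each series; idx, n.days and newdata2 are names invented here, and I have not tried it on the real data.

```r
# Same example data as "mydata" above, rebuilt so the block runs on its own
mydata <- data.frame(
  starts    = as.Date(c("2011-02-01", "2011-03-02")),
  ends      = as.Date(c("2011-03-15", "2011-03-31")),
  values    = c(2000, 10000),
  mystring1 = c("String 1", "String 2"),
  mystring2 = c("String a", "String b")
)

# Number of days per row, counting both the start and the end date
n.days <- as.integer(mydata$ends - mydata$starts) + 1L

# Row index repeated once for every day of that row's series
idx <- rep(seq_len(nrow(mydata)), times = n.days)

# sequence(n.days) counts 1..n within each row; minus 1 gives the day offset
newdata2 <- data.frame(
  mydate    = mydata$starts[idx] + (sequence(n.days) - 1L),
  myvalues  = (mydata$values / n.days)[idx],
  mystring1 = mydata$mystring1[idx],
  mystring2 = mydata$mystring2[idx]
)
```

With more columns to carry along, something like mydata[idx, other.cols] (other.cols being a hypothetical vector of column names) would avoid listing them one by one. The result still has one row per day, so the memory needed for the final table is the same as with the loop; what this avoids is the repeated copying done by rbind inside the loop.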
--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com