[R] creating chron object aggregates (e.g. sums by day)

Yves Brostaux brostaux.y at fsagx.ac.be
Wed Oct 31 10:53:05 CET 2001


Hello,

Here is an alternative way to solve your problem. It is not as simple and 
elegant as I would like, but it is quite easy to understand and maintain. 
Here it is:

# load the chron package, which provides dates()
library(chron)
# read your data, keeping the dates as character strings rather than factors
raw.data <- read.table("your data file", as.is=TRUE)
# aggregate your observations by day, using a numeric conversion of the dates
agg.data.1 <- tapply(raw.data[[2]], as.numeric(dates(raw.data[[1]])), sum)
agg.data.1
# move the result table from tapply back to a data frame with named rows
agg.data.2 <- data.frame(sumval = agg.data.1[1:length(agg.data.1)])
agg.data.2
# create the (numeric) date range of your observations
day.num <- as.numeric(row.names(agg.data.2))
date.stamp <- data.frame(date = min(day.num):max(day.num))
date.stamp
# merge the aggregated data with the date stamps
agg.data.3 <- merge(date.stamp, agg.data.2, by.x="date", by.y="row.names",
                    all.x=TRUE)
# replace the NAs on days without observations by zero
agg.data.3$sumval[is.na(agg.data.3$sumval)] <- 0
# convert the numeric values back to dates
agg.data.3$date <- dates(agg.data.3$date)
agg.data.3
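
Since you asked for a function, the steps above could be wrapped up roughly as follows. This is only a sketch: the name chron_aggregate and its arguments (day, values, FUN, fill) are my own invention and not part of chron, and it assumes the dates are in chron's default m/d/y format. It uses match() instead of merge() to find the days without data.

```r
library(chron)

# Hypothetical helper (not part of chron): aggregate `values` by calendar day
# with `FUN`, then give every day without observations the value `fill`.
chron_aggregate <- function(day, values, FUN = sum, fill = 0) {
  # whole-day numbers; floor() drops any time-of-day fraction
  day.num <- floor(as.numeric(dates(day)))
  # one aggregated value per observed day, named by the day number
  agg <- tapply(values, day.num, FUN)
  # every day between the first and the last observation
  all.days <- seq(min(day.num), max(day.num))
  # position of each day in the aggregate; NA where the day has no data
  idx <- match(all.days, as.numeric(names(agg)))
  out <- as.vector(agg)[idx]
  out[is.na(idx)] <- fill
  result <- data.frame(date = all.days, value = out)
  # convert the day numbers back to dates
  result$date <- dates(result$date)
  result
}

# the example data from the original question
raw.data <- data.frame(
  day   = c("07/09/01", "07/09/01", "07/11/01", "07/13/01",
            "07/13/01", "07/16/01", "07/17/01"),
  value = c(4000, 2000, 1000, 800, 700, 600, 500),
  stringsAsFactors = FALSE
)

result <- chron_aggregate(raw.data$day, raw.data$value, sum)
print(result)
```

Passing max (or any other function that returns a single value) instead of sum works the same way, and fill can be set to NA if zero is not a sensible value for days without data.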

Hope this helps a bit!

Yves.

=====================================================================
         YVES BROSTAUX - Ingénieur agronome Orientation Eaux & Forêts

         Assistant - Unité de Statistique et Informatique
         Gembloux Agricultural University
         8, avenue de la Faculté B-5030 Gembloux (Belgium)
         Tél:      +32 (0)81 62 24 69
         E-mail :  brostaux.y at fsagx.ac.be
=====================================================================

At 04:01 31/10/01, you wrote:
>Olivier Collignon wrote:
> >
> > What is the recommended/optimal way to perform aggregates on data frames
> > with chron objects?
> >
> > Here is an example:
> >
> > >raw.data
> > 1 07/09/01   4000
> > 2 07/09/01   2000
> > 3 07/11/01   1000
> > 4 07/13/01   800
> > 5 07/13/01   700
> > 6 07/16/01   600
> > 7 07/17/01   500
> >
> > I'm trying to construct a function that would first aggregate the data
> > (second column) by day (grouping by the first column) according to a
> > function (here "sum", but could be "max" or other)
> >
> > >chronaggregate(raw.data, sum, "days")   #(used "days" since 07/09/01 is
> > short for 07/09/01 00:00:00, but could be 07/09/01 00:12:34)
> > 1 07/09/01   6000      << sum of data values for day 07/09/01 from
> > raw.data
> > 2 07/11/01   1000
> > 3 07/13/01   1500      << sum of data values for day 07/13/01 from
> > raw.data
> > 4 07/16/01   600
> > 5 07/17/01   500
> >
> > and insert 0 values for days without data:
> >
> > 1 07/09/01   6000
> > 2 07/10/01   0              << inserted record
> > 3 07/11/01   1000
> > 4 07/12/01   0              << inserted record
> > 5 07/13/01   1500
> > 6 07/14/01   0              << inserted record
> > 7 07/15/01   0              << inserted record
> > 8 07/16/01   600
> > 9 07/17/01   500
> >
> > Is there a simple way to do this?
> >
> > Thanks,
> >
> > --
> > -Olivier
> >
> > --
> > Olivier Collignon
> > Loudcloud, Inc.
> > olivier at loudcloud.com
> >
> > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > Send "info", "help", or "[un]subscribe"
> > (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



