[R] creating chron object aggregates (e.g. sums by day)
Yves Brostaux
brostaux.y at fsagx.ac.be
Wed Oct 31 10:53:05 CET 2001
Hello,
Here is an alternative way to solve your problem. It is not as simple and
elegant as I would like, but it is quite easy to understand and maintain:
# load the chron package, which provides dates()
library(chron)
# read your data, keeping the dates as character strings rather than factors
raw.data <- read.table("your data file", as.is=TRUE)
# aggregate your observations by day, using a numerical conversion of the dates
agg.data.1 <- tapply(raw.data[[2]], as.numeric(dates(raw.data[[1]])), sum)
agg.data.1
# move the table returned by tapply back into a data frame with named rows
agg.data.2 <- data.frame(sumval = agg.data.1[1:length(agg.data.1)])
agg.data.2
# create the (numerical) range of dates covered by your observations
date.stamp <- data.frame(date =
    min(as.numeric(row.names(agg.data.2))):max(as.numeric(row.names(agg.data.2))))
date.stamp
# merge the aggregated data with the date stamps, keeping every day in the range
agg.data.3 <- merge(date.stamp, agg.data.2, by.x="date", by.y="row.names",
                    all.x=TRUE)
# replace the NAs for days without any value by zero
agg.data.3$sumval[is.na(agg.data.3$sumval)] <- 0
# convert the day numbers back to dates
agg.data.3$date <- dates(agg.data.3$date)
agg.data.3
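The same steps can also be wrapped into a single function, closer to the chronaggregate() call asked for in the original question. This is only a minimal sketch, assuming the chron package is installed and the dates are in the default "m/d/y" format; the name aggregate_by_day and its arguments (when, values, FUN, fill) are hypothetical and not part of chron:

```r
library(chron)

# Hypothetical helper (not part of chron): aggregate `values` by calendar day
# with FUN, then insert `fill` for every day in the range that has no data.
aggregate_by_day <- function(when, values, FUN = sum, fill = 0) {
  # day number of each observation; floor() drops any time-of-day fraction
  day <- floor(as.numeric(dates(when)))
  # one aggregated value per day that actually occurs in the data
  agg <- tapply(values, day, FUN)
  # every day between the first and the last observation
  all.days <- seq(min(day), max(day))
  # start from the fill value, then overwrite the days that have data
  out <- rep(fill, length(all.days))
  out[match(as.numeric(names(agg)), all.days)] <- as.vector(agg)
  data.frame(date = dates(all.days), value = out)
}

# usage with the data from the original question
when   <- c("07/09/01", "07/09/01", "07/11/01", "07/13/01",
            "07/13/01", "07/16/01", "07/17/01")
values <- c(4000, 2000, 1000, 800, 700, 600, 500)
result <- aggregate_by_day(when, values, sum)
result
```

With this data the result should have nine rows, one per day from 07/09/01 to 07/17/01, with zeros on the days without observations. Passing FUN = max would give the daily maximum instead; fill is then still used for the empty days.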
Hope this helps a bit!
Yves.
=====================================================================
YVES BROSTAUX - Agricultural Engineer, Water & Forests
Assistant - Statistics and Computer Science Unit
Gembloux Agricultural University
8, avenue de la Faculté B-5030 Gembloux (Belgium)
Tel: +32 (0)81 62 24 69
E-mail : brostaux.y at fsagx.ac.be
=====================================================================
At 04:01 31/10/01, you wrote:
>Olivier Collignon wrote:
> >
> > What is the recommended/optimal way to perform aggregates on data frames
> > with chron objects?
> >
> > Here is an example:
> >
> > >raw.data
> > 1 07/09/01 4000
> > 2 07/09/01 2000
> > 3 07/11/01 1000
> > 4 07/13/01 800
> > 5 07/13/01 700
> > 6 07/16/01 600
> > 7 07/17/01 500
> >
> > I'm trying to construct a function that would first aggregate the data
> > (second column) by day (grouping by the first column) according to a
> > function (here "sum", but could be "max" or other)
> >
> > >chronaggregate(raw.data, sum, "days")
> > (used "days" since 07/09/01 is short for 07/09/01 00:00:00, but could be
> > 07/09/01 00:12:34)
> > 1 07/09/01 6000 << sum of data values for day 07/09/01 from raw.data
> > 2 07/11/01 1000
> > 3 07/13/01 1500 << sum of data values for day 07/13/01 from raw.data
> > 4 07/16/01 600
> > 5 07/17/01 500
> >
> > and insert 0 values for days without data:
> >
> > 1 07/09/01 6000
> > 2 07/10/01 0 << inserted record
> > 3 07/11/01 1000
> > 4 07/12/01 0 << inserted record
> > 5 07/13/01 1500
> > 6 07/14/01 0 << inserted record
> > 7 07/15/01 0 << inserted record
> > 8 07/16/01 600
> > 9 07/17/01 500
> >
> > Is there a simple way to do this?
> >
> > Thanks,
> >
> > --
> > -Olivier
> >
> > --
> > Olivier Collignon
> > Loudcloud, Inc.
> > olivier at loudcloud.com
> >
> >
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> > Send "info", "help", or "[un]subscribe"
> > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> >
> > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._