[R] Calculate average of many subsets based on columns in another dataframe
William Dunlap
wdunlap at tibco.com
Thu Feb 11 00:02:10 CET 2016
You could try pulling some of the repeated subscripting operations,
especially the insertions, out of the loop. E.g.,
values <- observations[, "values"]
date <- observations[, "date"]
groups$average <- vapply(seq_len(NROW(groups)), function(i)
    mean(values[date >= groups[i, "start"] & date <= groups[i, "end"]]),
    FUN.VALUE = 0)
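
If the vapply() version is still too slow on the real data, one option that removes the per-row subsetting entirely is to work from cumulative sums. The following is only a sketch under assumptions not stated in the original post: that observations is sorted by date, that values contains no NAs, and that your R is recent enough for findInterval() to have the left.open argument. The names csum, lo and hi are just illustrative.

```r
# Rebuild the example data from the original post.
set.seed(345)
date.range <- seq(as.POSIXct("2015-01-01"), as.POSIXct("2015-06-01"),
                  by = "DSTday")
observations <- data.frame(date = date.range, values = runif(152, 1, 100))
groups <- data.frame(start = sample(date.range[1:50], 20),
                     end = sample(date.range[51:152], 20), average = NA)

values <- observations[, "values"]
date <- as.numeric(observations[, "date"])   # assumed sorted, no NAs

# csum[k + 1] holds the sum of the first k values (csum[1] is 0).
csum <- c(0, cumsum(values))

# lo: number of observation dates strictly before each start.
# hi: number of observation dates at or before each end.
lo <- findInterval(as.numeric(groups$start), date, left.open = TRUE)
hi <- findInterval(as.numeric(groups$end), date)

# Sum over each closed range [start, end] divided by its count.
# An empty range gives 0/0 = NaN, which is what mean(numeric(0)) returns.
groups$average <- (csum[hi + 1] - csum[lo + 1]) / (hi - lo)
```

Because the sums are taken as differences of running totals, the results can differ from mean() in the last few digits, so compare with all.equal() rather than identical().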
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, Feb 10, 2016 at 12:18 PM, Peter Lomas <peter.br.lomas at gmail.com>
wrote:
> Hello, I have a dataframe with a date range, and another dataframe
> with observations by date. For each date range, I'd like to average
> the values within that range from the other dataframe. I've provided
> code below doing what I would like, but using a for loop is too
> inefficient for my actual case (takes about an hour). So I'm looking
> for a way to vectorize.
>
>
> set.seed(345)
> date.range <- seq(as.POSIXct("2015-01-01"),as.POSIXct("2015-06-01"),
> by="DSTday")
> observations <- data.frame(date=date.range, values=runif(152,1,100) )
> groups <- data.frame(start=sample(date.range[1:50], 20), end =
> sample(date.range[51:152], 20), average = NA)
>
> #Potential Solution (too inefficient)
>
> for(i in 1:NROW(groups)){
> groups[i, "average"] <- mean(observations[observations$date >=
> groups[i, "start"] & observations$date <=groups[i, "end"], "values"])
> }
>
> As an extension to this, there will end up being multiple value
> columns, and each range will also identify which column to average. I
> think if I can figure out the first problem I can try to extend it
> myself.
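
For the extension, a minimal sketch along the same lines, assuming the column to average is named in a character column of groups. The names values2 and column are hypothetical and do not come from the original post.

```r
set.seed(345)
date.range <- seq(as.POSIXct("2015-01-01"), as.POSIXct("2015-06-01"),
                  by = "DSTday")
observations <- data.frame(date = date.range,
                           values = runif(152, 1, 100),
                           values2 = runif(152, 1, 100))  # hypothetical 2nd column
groups <- data.frame(start = sample(date.range[1:50], 20),
                     end = sample(date.range[51:152], 20),
                     # hypothetical: which column each range should average
                     column = sample(c("values", "values2"), 20, replace = TRUE),
                     stringsAsFactors = FALSE)

# Pull the vectors out once so the loop body only does vector subscripting.
date <- observations$date
start <- groups$start
end <- groups$end
column <- groups$column

groups$average <- vapply(seq_len(NROW(groups)), function(i) {
    in.range <- date >= start[i] & date <= end[i]
    mean(observations[[column[i]]][in.range])
}, FUN.VALUE = 0)
```
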
>
> Thanks,
> Peter