[R] block averaging data frames

Mathew Brown mathew.brown at forst.uni-goettingen.de
Mon Dec 19 15:37:50 CET 2011


Almost.
tapply(x[4:8], x$interval, colMeans)
works, but with a larger data frame I run into problems. Even
tapply(x[4:7], x$interval, colMeans)
fails with the "arguments must have the same length" error. But they do
have the same length! Any ideas?

Thanks for your help, Jim.
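
A likely explanation, offered as a sketch rather than a confirmed diagnosis: tapply() compares length(X) with the length of the index, and for a data frame length() is the number of columns, not rows. In the five-row example, x[4:8] has five columns, so the check passes by coincidence; x[4:7] has four columns against five index values, and a larger frame has many more rows than columns. One way around this is to split the rows explicitly with split() and apply colMeans to each piece. The toy data frame and its column names (a, b, c) below are made up for illustration; the breaks and interval steps follow the code quoted further down.

```r
# Hypothetical toy data: 6 rows at 10-minute steps, 3 numeric columns
# (the column names a, b, c are invented for this sketch).
x <- data.frame(
  timestamp = as.POSIXct("2011-11-01 00:00:00", tz = "UTC") + 600 * (0:5),
  a = c(1, 2, 3, 4, 5, 6),
  b = c(10, 20, 30, 40, 50, 60),
  c = c(0, 0, 3, 3, 3, 9)
)

# 30-minute breakpoints and an interval index per row.
start  <- trunc(min(x$timestamp), units = "hours")
breaks <- seq(from = start, to = max(x$timestamp) + 3600, by = "30 min")
x$interval <- findInterval(x$timestamp, breaks)

num_cols <- c("a", "b", "c")
length(x[num_cols])   # 3: a data frame's length is its column count
length(x$interval)    # 6: one index value per row

# Split the rows by interval, average each piece column-wise,
# and bind the per-interval means back into one data frame.
pieces  <- split(x[num_cols], x$interval)
newData <- as.data.frame(do.call(rbind, lapply(pieces, colMeans)))

# Row names are the interval indices, so they map back to the breakpoints.
newData$timestamp <- breaks[as.integer(rownames(newData))]
newData
```

Because split() on a data frame always works row-wise, this does not depend on the number of rows matching the number of columns. aggregate(x[num_cols], by = list(interval = x$interval), FUN = mean) should give the same means in a single call.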


On 12/19/2011 2:00 PM, jim holtman wrote:
> Does this work for you:
>
>> x<- read.table(text = " date time  Voltage LwTempDownelling LwDownwelling LwDownwelling_min LwDownwelling_max LwTempUpwelling
> + 1 2011-11-01 00:00:00 2.732447            17.30          30.0              14.0              39.5           17.83
> + 2 2011-11-01 00:10:00 2.731534            17.46          15.3              11.1              24.6           17.95
> + 3 2011-11-01 00:20:00 2.731368            17.43          28.7              24.6              30.7           17.93
> + 4 2011-11-01 00:30:00 2.730703            17.36          40.4              29.8              43.5           17.86
> + 5 2011-11-01 00:40:00 2.729567            17.26          41.6              40.5              42.6           17.76"
> +     , header = TRUE
> +     )
>> # convert the time
>> x$timestamp<- as.POSIXct(paste(x$date, x$time))
>> # calculate the start of time ranges
>> start<- trunc(min(x$timestamp), units = 'hour')
>> # create breakpoints at 30 minutes
>> breaks<- seq(from = start
> +             , to = max(x$timestamp) + 3600  # make sure you have the last range
> +             , by = '30 min'
> +             )
>> # slice up the data by adding index
>> x$interval<- findInterval(x$timestamp, breaks)
>>
>> # determine colMeans
>> newData<- do.call(rbind, tapply(x[4:8], x$interval, colMeans))
>> newData<- as.data.frame(newData)
>>
>> # add the time back
>> newData$timestamp<- breaks[as.integer(rownames(newData))]
>> newData
>   LwTempDownelling LwDownwelling LwDownwelling_min LwDownwelling_max LwTempUpwelling           timestamp
> 1         17.39667      24.66667          16.56667             31.60        17.90333 2011-11-01 00:00:00
> 2         17.31000      41.00000          35.15000             43.05        17.81000 2011-11-01 00:30:00
>>
>
> On Mon, Dec 19, 2011 at 4:28 AM, Mathew Brown
> <mathew.brown at forst.uni-goettingen.de>  wrote:
>>
>> Hi there,
>>
>> This seems like it should be simple. I have a data frame of climate data
>> sampled every 10 min. I want to average the entire data frame into 30
>> min values (i.e., one value for each half hour). Functions like
>> running.mean give me a moving average, but I want to reduce the size of
>> the entire frame. Any ideas how? Cheers!
>>
>> Example of my data
>>
>>   timestamp  Voltage LwTempDownelling LwDownwelling LwDownwelling_min LwDownwelling_max LwTempUpwelling
>> 1 2011-11-01 00:00:00 2.732447            17.30          30.0              14.0              39.5           17.83
>> 2 2011-11-01 00:10:00 2.731534            17.46          15.3              11.1              24.6           17.95
>> 3 2011-11-01 00:20:00 2.731368            17.43          28.7              24.6              30.7           17.93
>> 4 2011-11-01 00:30:00 2.730703            17.36          40.4              29.8              43.5           17.86
>> 5 2011-11-01 00:40:00 2.729567            17.26          41.6              40.5              42.6           17.76
>> 6 2011-11-01 00:50:00 2.728976            17.16          39.7
>>
>>
>> -M.B
>>
>>
>
>


