# [R] Summing data based on certain conditions

Phil Spector spector at stat.berkeley.edu
Fri Apr 2 16:57:20 CEST 2010

Steve -
Take a closer look at the help page for ave(), especially
the ... argument.  Try

data_ave <- ave(data$rammday, data$month, data$year, FUN=mean)

(Assuming you want to calculate the mean -- your example
didn't specify a function.)
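
As a minimal sketch of the difference, assuming a small data frame built from the first nine rows quoted in this thread: ave() takes each grouping variable as its own argument through `...`, so FUN has to be passed by name. Wrapping the columns in c() instead glues them into a single vector twice as long as the data, which is what the split() warning is complaining about.

```r
# First nine rows of the data quoted in this thread (year, month, rammday only)
data <- data.frame(
  year    = rep(1988, 9),
  month   = c(3, 3, 3, 3, 4, 4, 4, 4, 4),
  rammday = c(1.43, 2.86, 5.06, 18.76, 4.49, 8.57, 31.18, 19.67, 3.14)
)

# Each grouping variable is a separate argument (the ... of ave()).
# FUN must be named, otherwise it would be treated as another grouping variable.
data_ave <- ave(data$rammday, data$month, data$year, FUN = mean)

# ave() returns one value per input row: the mean of that row's group.
length(data_ave) == nrow(data)

# c() concatenates the two columns, so the "grouping variable" no longer
# has the same length as the data.
length(c(data$month, data$year))   # twice nrow(data)
```

Note that the result is as long as the input, with the group mean repeated on every row of the group, rather than one value per month.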

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
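
For the original question quoted further down (monthly totals taken as the mean daily value times the number of days in the month), one possible approach is sketched here. It uses aggregate() rather than ave() so that there is one row per year/month combination. `days_in_month` is a hypothetical helper written for this example, not a base R function, and the data frame is retyped from the 15 rows shown in the thread.

```r
# The 15 rows quoted in the thread (year, month, rammday only)
data <- data.frame(
  year    = rep(1988, 15),
  month   = c(3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6),
  rammday = c(1.43, 2.86, 5.06, 18.76, 4.49, 8.57, 31.18, 19.67, 3.14,
              11.51, 5.64, 37.46, 9.86, 13.00, 0.43)
)

# Mean daily rainfall for each year/month combination (one row per group)
monthly <- aggregate(rammday ~ month + year, data = data, FUN = mean)

# Hypothetical helper: days in a given month, with Gregorian leap years
days_in_month <- function(year, month) {
  days <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)[month]
  leap <- (year %% 4 == 0 & year %% 100 != 0) | (year %% 400 == 0)
  days + (month == 2 & leap)
}

# Monthly total = mean daily value * number of days in that month
monthly$total <- monthly$rammday * days_in_month(monthly$year, monthly$month)
monthly
```

Because the grouping is on both month and year, the same month in different years stays separate, which is the behaviour asked for in the original post.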

On Fri, 2 Apr 2010, Steve Murray wrote:

>
> Dear all,
>
> Thanks for the contributions so far. I've had a look at these and the closest I've come to solving it is the following:
>
>> data_ave <- ave(data$rammday, by=c(data$month, data$year))
> Warning messages:
> 1: In split.default(x, g) :
>   data length is not a multiple of split variable
> 2: In split.default(seq_along(x), f, drop = drop, ...) :
>   data length is not a multiple of split variable
>
>
> I'm slightly confused by the warning message, as the data lengths do appear the same:
>
>> dim(data)
> [1] 1073    6
>> length(data$year)
> [1] 1073
>> length(data$month)
> [1] 1073
>
>
> Maybe the approach I'm taking is wrong. Any suggestions would be gratefully received.
>
> Many thanks,
>
> Steve
>
>
> ----------------------------------------
>> Date: Wed, 31 Mar 2010 23:31:25 +0200
>> From: Stephan.Kolassa at gmx.de
>> To: smurray444 at hotmail.com
>> CC: r-help at r-project.org
>> Subject: Re: [R] Summing data based on certain conditions
>>
>> ?by may also be helpful.
>>
>> Stephan
>>
>>
>> Steve Murray schrieb:
>>> Dear all,
>>>
>>> I have a dataset of 1073 rows, the first 15 of which look as follows:
>>>
>>>> data[1:15,]
>>>         date year month day rammday thmmday
>>> 1   3/8/1988 1988     3   8    1.43    0.94
>>> 2  3/15/1988 1988     3  15    2.86    0.66
>>> 3  3/22/1988 1988     3  22    5.06    3.43
>>> 4  3/29/1988 1988     3  29   18.76   10.93
>>> 5   4/5/1988 1988     4   5    4.49    2.70
>>> 6  4/12/1988 1988     4  12    8.57    4.59
>>> 7  4/16/1988 1988     4  16   31.18   22.18
>>> 8  4/19/1988 1988     4  19   19.67   12.33
>>> 9  4/26/1988 1988     4  26    3.14    1.79
>>> 10  5/3/1988 1988     5   3   11.51    6.33
>>> 11 5/10/1988 1988     5  10    5.64    2.89
>>> 12 5/17/1988 1988     5  17   37.46   20.89
>>> 13 5/24/1988 1988     5  24    9.86    9.81
>>> 14 5/31/1988 1988     5  31   13.00    8.63
>>> 15  6/7/1988 1988     6   7    0.43    0.00
>>>
>>>
>>> I am looking for a way by which I can create monthly totals of rammday (rainfall in mm/day; column 5) by doing the following:
>>>
>>> For each case where the month value and the year are the same (e.g. 3 and 1988, in the first four rows), find the mean of the corresponding rammday values and then multiply it by the number of days in that month (i.e. 31 in this case).
>>>
>>> Note however that the number of month values in each case isn't always the same (e.g. in this subset of data, there are 4 values for month 3, 5 for month 4 and 5 for month 5). Also the months will of course recycle for the following years, so it's not simply a case of finding a monthly total for *all* the 3s in the whole dataset, just those associated with each year in turn.
>>>
>>> How would I go about doing this in R?
>>>
>>> Any help will be gratefully received.
>>>
>>> Many thanks,
>>>
>>> Steve
>>>
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>