[R] Summing data based on certain conditions

David Winsemius dwinsemius at comcast.net
Fri Apr 2 17:19:34 CEST 2010


On Apr 2, 2010, at 10:36 AM, Steve Murray wrote:

>
> Dear all,
>
> Thanks for the contributions so far. I've had a look at these and  
> the closest I've come to solving it is the following:
>
>> data_ave <- ave(data$rammday, by=c(data$month, data$year))
> Warning messages:
> 1: In split.default(x, g) :
>   data length is not a multiple of split variable
> 2: In split.default(seq_along(x), f, drop = drop, ...) :
>   data length is not a multiple of split variable
>
>
> I'm slightly confused by the warning message, as the data lengths do  
> appear the same:
>
>> dim(data)
> [1] 1073    6
>> length(data$year)
> [1] 1073
>> length(data$month)
> [1] 1073

All true, no doubt, but did you look at

length (c(data$month, data$year) )  # ??
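
To spell out what that check shows, here is a minimal sketch on toy data (the first 15 rows quoted further down). The names `ram_mean`, `monthly`, `total_mm` and the helper `days_in_month` are illustrative assumptions, not anything from the original code; the multiply-by-days step follows the description in the original question.

```r
# Toy data: year, month and rammday from the 15 rows quoted in this thread
data <- data.frame(
  year    = rep(1988, 15),
  month   = c(3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6),
  rammday = c(1.43, 2.86, 5.06, 18.76, 4.49, 8.57, 31.18, 19.67, 3.14,
              11.51, 5.64, 37.46, 9.86, 13.00, 0.43)
)

# c() concatenates the two columns into ONE vector, twice as long as the data,
# which is why split() complains about mismatched lengths
length(c(data$month, data$year))   # 30 here

# ave() takes each grouping variable as its own argument; FUN defaults to mean
data$ram_mean <- ave(data$rammday, data$month, data$year)

# Hypothetical helper: days in a month, with the Gregorian leap-year rule
days_in_month <- function(year, month) {
  days <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)[month]
  leap <- (year %% 4 == 0 & year %% 100 != 0) | (year %% 400 == 0)
  days + (month == 2 & leap)
}

# One row per month/year: mean daily rainfall times the days in that month
monthly <- aggregate(rammday ~ month + year, data = data, FUN = mean)
monthly$total_mm <- monthly$rammday * days_in_month(monthly$year, monthly$month)
monthly
```

ave() returns a vector as long as its input, so it suits attaching a group mean to every row; aggregate() returns one row per month/year combination, which is probably closer to the monthly totals being asked for.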

-- 
David.
>
>
> Maybe the approach I'm taking is wrong. Any suggestions would be  
> gratefully received.
>
> Many thanks,
>
> Steve
>
>
> ----------------------------------------
>> Date: Wed, 31 Mar 2010 23:31:25 +0200
>> From: Stephan.Kolassa at gmx.de
>> To: smurray444 at hotmail.com
>> CC: r-help at r-project.org
>> Subject: Re: [R] Summing data based on certain conditions
>>
>> ?by may also be helpful.
>>
>> Stephan
>>
>>
>> Steve Murray wrote:
>>> Dear all,
>>>
>>> I have a dataset of 1073 rows, the first 15 of which look as follows:
>>>
>>>> data[1:15,]
>>>         date year month day rammday thmmday
>>> 1   3/8/1988 1988     3   8    1.43    0.94
>>> 2  3/15/1988 1988     3  15    2.86    0.66
>>> 3  3/22/1988 1988     3  22    5.06    3.43
>>> 4  3/29/1988 1988     3  29   18.76   10.93
>>> 5   4/5/1988 1988     4   5    4.49    2.70
>>> 6  4/12/1988 1988     4  12    8.57    4.59
>>> 7  4/16/1988 1988     4  16   31.18   22.18
>>> 8  4/19/1988 1988     4  19   19.67   12.33
>>> 9  4/26/1988 1988     4  26    3.14    1.79
>>> 10  5/3/1988 1988     5   3   11.51    6.33
>>> 11 5/10/1988 1988     5  10    5.64    2.89
>>> 12 5/17/1988 1988     5  17   37.46   20.89
>>> 13 5/24/1988 1988     5  24    9.86    9.81
>>> 14 5/31/1988 1988     5  31   13.00    8.63
>>> 15  6/7/1988 1988     6   7    0.43    0.00
>>>
>>>
>>> I am looking for a way by which I can create monthly totals of  
>>> rammday (rainfall in mm/day; column 5) by doing the following:
>>>
>>> For each case where the month value and the year are the same
>>> (e.g. 3 and 1988, in the first four rows), find the mean of the
>>> corresponding rammday values and then multiply by the number of
>>> days in that month (i.e. 31 in this case).
>>>
>>> Note however that the number of month values in each case isn't  
>>> always the same (e.g. in this subset of data, there are 4 values  
>>> for month 3, 5 for month 4 and 5 for month 5). Also the months  
>>> will of course recycle for the following years, so it's not simply  
>>> a case of finding a monthly total for *all* the 3s in the whole  
>>> dataset, just those associated with each year in turn.
>>>
>>> How would I go about doing this in R?
>>>
>>> Any help will be gratefully received.
>>>
>>> Many thanks,
>>>
>>> Steve
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>

David Winsemius, MD
West Hartford, CT


