[R] maintaining row connections during aggregate

Kara Przeczek przeczek at unbc.ca
Mon Jun 13 22:30:25 CEST 2011


Dear All,
I have several sets of data such as this:

  year jday  avg_m3s
1 1960    1 4.262307
2 1960    2 4.242308
3 1960    3 4.216923
4 1960    4 4.185385
5 1960    5 4.151538
6 1960    6 4.133846
 ...

There is a value for each day of multiple years. In this particular data set it goes up to 1974. I am looking to obtain the minimum and maximum values for each year, and also to know on which Julian day ("jday") they occurred.
I can get the maximum value for each year with:

> mx = aggregate(ddat$avg_m3s, list(Year=ddat$year), max, na.rm=T)
> colnames(mx) <- c("year","max_daily")

   year max_daily
1  1960  60.24615
2  1961  73.90000
3  1962  56.40000
...


But I want to output the max with the corresponding day on which it occurred, such as:
  year jday  avg_m3s
1 1960  136 60.24615
2 1961  129 73.90000
3 1962  111 56.40000


I haven't been able to determine how to keep each value tied to its day without aggregating by both year *and* day, which is what happened with:
aggregate(ddat$avg_m3s, list(Year=ddat$year, Day = ddat$jday), max, na.rm=T),
resulting in a value output for every single day of each year.

Other attempts to get both columns to output failed.
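
One direction that might work here, given as a rough sketch rather than a tested solution for the real data: split the data frame by year and keep the whole row where the extreme occurs, so that "jday" stays attached to the value. The small data frame below is made up for illustration (it only reuses the column names of ddat), and the names max_rows and min_rows are hypothetical.

```r
# Made-up stand-in for ddat: same column names, invented values.
ddat <- data.frame(
  year    = rep(c(1960, 1961), each = 4),
  jday    = rep(1:4, times = 2),
  avg_m3s = c(4.2, 60.2, 5.1, NA, 3.9, 7.5, 73.9, 6.0)
)

# Split the rows by year, keep the single row holding the maximum in each
# piece, and bind the pieces back into one data frame.
# which.max() discards NA values and returns the first position on ties.
max_rows <- do.call(rbind, lapply(split(ddat, ddat$year), function(d) {
  d[which.max(d$avg_m3s), ]
}))
rownames(max_rows) <- NULL

# Same idea for the yearly minimum, using which.min().
min_rows <- do.call(rbind, lapply(split(ddat, ddat$year), function(d) {
  d[which.min(d$avg_m3s), ]
}))
rownames(min_rows) <- NULL

print(max_rows)
print(min_rows)
```

Because whole rows are selected instead of a single aggregated column, no second grouping variable is needed. Note that if a year's extreme occurs on more than one day, only the first such day is kept.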

Any help would be greatly appreciated!
Kara


