[R] aggregation with extra columns
Paul Sorenson
Paul.Sorenson at vision-bio.com
Wed Feb 2 02:17:32 CET 2005
R People,
Thanks for your help on my recent questions. Excel is never going to disappear from my office, but with graphics from the lattice package and some other stuff in R I have been able to add some value.
I have a problem with aggregation that I haven't been able to figure out. I mentioned it earlier but didn't state it very clearly.
Basically I have many "defect events" and I want to grab the most recent event for each defect number:
e.g.:
"date" "defectnum" "state"
2004-12-1 10 create
2004-12-2 11 create
2004-12-4 10 close
2004-12-7 11 fix
to:
"date" "defectnum" "state"
2004-12-4 10 close
2004-12-7 11 fix
Now with aggregate I can get the rows I want but not with the state "attached":
aggregate(list(date=ev$date), by=list(defectnum=ev$defectnum), max)
This gives me the rows I want, but I have lost the "state". I have tried doing a merge afterwards, but now I realise why they warned me to avoid using dates as database keys.
What would be handy is somehow getting back the index vector from the aggregate function. I realize in the general case this wouldn't work for aggregate but in the case of min/max the result is a specific record.
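One way to sketch the "index vector" idea without aggregate is to split the row numbers by defect number and keep, for each group, the row holding the latest date. This is only an illustration: it assumes ev is a data frame shaped like the example above with date stored as class Date, and the names rows, idx and latest are made up for the sketch.

```r
# Sample data copied from the example in this post (dates zero-padded).
ev <- data.frame(
  date      = as.Date(c("2004-12-01", "2004-12-02", "2004-12-04", "2004-12-07")),
  defectnum = c(10, 11, 10, 11),
  state     = c("create", "create", "close", "fix"),
  stringsAsFactors = FALSE
)

# Row numbers of ev, grouped by defect number.
rows <- seq_len(nrow(ev))
groups <- split(rows, ev$defectnum)

# For each group, pick the row number whose date is the latest.
# (idx, rows and latest are hypothetical names used only in this sketch.)
idx <- vapply(
  groups,
  function(i) i[which.max(as.numeric(ev$date[i]))],
  integer(1)
)
idx <- unname(idx)

# Index the original data frame, so every column (including state) is kept.
latest <- ev[idx, ]
print(latest)
```

Because idx is an ordinary vector of row numbers, it can be reused to pull any other column from ev. If two events for the same defect share the latest date, which.max keeps the first of them.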
Someone earlier mentioned some tricks with sort but I haven't been able to make that get to where I want.
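The sort trick may have been something along these lines: order the events by defect number and then by date, and keep the last row of each defect. This is a guess at what was meant, not the original suggestion, and it again assumes ev has a date column of class Date; sorted and latest are names invented for the sketch.

```r
# Sample data copied from the example in this post (dates zero-padded).
ev <- data.frame(
  date      = as.Date(c("2004-12-01", "2004-12-02", "2004-12-04", "2004-12-07")),
  defectnum = c(10, 11, 10, 11),
  state     = c("create", "create", "close", "fix"),
  stringsAsFactors = FALSE
)

# Sort by defect number, then by date within each defect.
sorted <- ev[order(ev$defectnum, ev$date), ]

# duplicated(..., fromLast = TRUE) flags every row except the last one
# seen for each defect number, so negating it keeps the most recent event.
latest <- sorted[!duplicated(sorted$defectnum, fromLast = TRUE), ]
print(latest)
```

This keeps whole rows, so state stays attached to the date it belongs to and no merge on dates is needed.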