[R] Find last row (observation) for each combination of variables
Leif Kirschenbaum
leif at reflectivity.com
Tue Jan 10 22:03:48 CET 2006
Let's say I have a data.frame like
A B C TS other columns
1 1 1 12345
1 1 1 56789
1 2 1 23456
1 2 2 23457
2 4 7 23458
2 4 7 34567
2 4 7 45678
and I want the last row for each unique combination of A/B/C, where by "last" I mean greatest TS.
A B C TS other columns
1 1 1 56789
1 2 1 23456
1 2 2 23457
2 4 7 45678
I did this simply in SAS:
proc sort data=DF;
by A B C descending TS;
run;
proc sort data=DF NODUPKEY;
by A B C;
run;
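For comparison, the two SAS steps above (sort with TS descending, then keep the first row per key) can be sketched in base R with order() and duplicated(). This is only a minimal sketch: the toy data frame is rebuilt from the example rows in this post, it assumes TS is numeric, and the names sorted and last_rows are made up for illustration.

```r
# Toy data copied from the example above (the "other columns" are omitted).
DF <- data.frame(
  A  = c(1, 1, 1, 1, 2, 2, 2),
  B  = c(1, 1, 2, 2, 4, 4, 4),
  C  = c(1, 1, 1, 2, 7, 7, 7),
  TS = c(12345, 56789, 23456, 23457, 23458, 34567, 45678)
)

# Step 1: sort by A, B, C ascending and by TS descending
# (the unary minus assumes TS is numeric).
sorted <- DF[order(DF$A, DF$B, DF$C, -DF$TS), ]

# Step 2: keep only the first row seen for each A/B/C key,
# which plays the role of NODUPKEY in the SAS version.
last_rows <- sorted[!duplicated(sorted[c("A", "B", "C")]), ]
print(last_rows)
```

If two rows of one A/B/C combination share the same maximum TS, this keeps only one of them.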
I tried using "aggregate" to find the maximum TS for each combination of A/B/C, but it's slow.
I also tried "by" but it's also slow.
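The exact aggregate call is not shown here, so the following is only a guess at its general shape: compute the maximum TS per A/B/C group with the formula interface of aggregate(), then merge back on all four columns to recover the full rows. The toy data is rebuilt from the example rows, and max_ts and last_rows are illustrative names.

```r
# Toy data copied from the example above (the "other columns" are omitted).
DF <- data.frame(
  A  = c(1, 1, 1, 1, 2, 2, 2),
  B  = c(1, 1, 2, 2, 4, 4, 4),
  C  = c(1, 1, 1, 2, 7, 7, 7),
  TS = c(12345, 56789, 23456, 23457, 23458, 34567, 45678)
)

# One row per A/B/C combination, holding the largest TS in that group.
max_ts <- aggregate(TS ~ A + B + C, data = DF, FUN = max)

# Joining on A, B, C and TS keeps only the rows whose TS is the group maximum,
# and brings back any other columns DF carries.
last_rows <- merge(DF, max_ts, by = c("A", "B", "C", "TS"))
print(last_rows)
```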
My current (faster) solution is:
DF$abc<-paste(DF$A,DF$B,DF$C,sep="_")  # a separator keeps keys such as 1,12,3 and 11,2,3 distinct
abclist<-unique(DF$abc)                # one entry per A/B/C combination
numtest<-length(abclist)
maxTS<-rep(0,numtest)
for(i in seq_len(numtest)){
maxTS[i]<-max(DF$TS[DF$abc==abclist[i]],na.rm=TRUE)
}
maxTSdf<-data.frame(abc=I(abclist),maxTS=maxTS)   # key column must be named "abc" for the merge
DF<-merge(DF,maxTSdf,by="abc",all.x=TRUE)
DF<-DF[DF$TS==DF$maxTS,,drop=FALSE]    # keep only the rows holding the group maximum
DF$maxTS<-NULL
This seems a bit lengthy for such a simple task.
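One way the loop-and-merge steps above might be shortened is with ave(), which returns the per-group maximum aligned to the original rows. This is a sketch under assumptions, not a benchmarked replacement: the toy data is rebuilt from the example rows, and key, grp_max and last_rows are illustrative names. The key is built with interaction(..., drop = TRUE) so that A/B/C combinations that never occur do not produce empty groups for max().

```r
# Toy data copied from the example above (the "other columns" are omitted).
DF <- data.frame(
  A  = c(1, 1, 1, 1, 2, 2, 2),
  B  = c(1, 1, 2, 2, 4, 4, 4),
  C  = c(1, 1, 1, 2, 7, 7, 7),
  TS = c(12345, 56789, 23456, 23457, 23458, 34567, 45678)
)

# A single grouping factor for the A/B/C combinations that actually occur.
key <- interaction(DF$A, DF$B, DF$C, drop = TRUE)

# Per-row copy of the group maximum of TS (same length as DF$TS).
grp_max <- ave(DF$TS, key, FUN = max)

# Keep the rows whose TS equals their group maximum; row order is preserved.
last_rows <- DF[DF$TS == grp_max, , drop = FALSE]
print(last_rows)
```

Unlike a sort-and-deduplicate approach, this keeps every tied row when several rows of a group share the maximum TS.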
Any simpler suggestions?
-Leif K.
Leif Kirschenbaum
Senior Yield Engineer
Reflectivity, Inc.
(408) 737-8100 x307
leif at reflectivity.com