[R] get top n rows group by a column from a dataframe
William Dunlap
wdunlap at tibco.com
Thu Sep 16 20:18:08 CEST 2010
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Tan, Richard
> Sent: Thursday, September 16, 2010 8:39 AM
> To: r-help at r-project.org
> Subject: [R] get top n rows group by a column from a dataframe
>
> Hi, is there an R function like sql's TOP key word?
>
>
>
> I have a dataframe that has 3 columns: company, person, salary
>
>
>
> How do I get top 5 highest paid person for each company, and if I have
> fewer than 5 people for a company, just return all of them?
You could use ave() to assign a within-company ranking
to each person and later extract the rows with the ranks
you want (or sort by rank, etc.):
> set.seed(1)
> DF <-
data.frame(Company=sample(c("Ford","Toyota","GM"),size=18,replace=TRUE),
Person=LETTERS[1:18],Salary=runif(18)*1e5)
> DF <- within(DF, rank <- ave(Salary, Company,
FUN=function(x)rev(order(x))))
> DF
Company Person Salary rank
1 Ford A 38003.518 4
2 Toyota B 77744.522 2
3 Toyota C 93470.523 1
4 GM D 21214.252 6
5 Ford E 65167.377 2
6 GM F 12555.510 7
7 GM G 26722.067 5
8 Toyota H 38611.409 5
9 Toyota I 1339.033 3
10 Ford J 38238.796 3
11 Ford K 86969.085 1
12 Ford L 34034.900 5
13 GM M 48208.012 4
14 Toyota N 59956.583 6
15 GM O 49354.131 3
16 Toyota P 18621.760 4
17 GM Q 82737.332 1
18 GM R 66846.674 2
> subset(DF, rank==1)
Company Person Salary rank
3 Toyota C 93470.52 1
11 Ford K 86969.08 1
17 GM Q 82737.33 1
> subset(DF, rank<=5)
Company Person Salary rank
1 Ford A 38003.518 4
2 Toyota B 77744.522 2
3 Toyota C 93470.523 1
5 Ford E 65167.377 2
7 GM G 26722.067 5
8 Toyota H 38611.409 5
9 Toyota I 1339.033 3
10 Ford J 38238.796 3
11 Ford K 86969.085 1
12 Ford L 34034.900 5
13 GM M 48208.012 4
15 GM O 49354.131 3
16 Toyota P 18621.760 4
17 GM Q 82737.332 1
18 GM R 66846.674 2
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
>
>
>
> Thanks,
>
> Richard
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list