[R] Add rank column to data frame as in SQL...
Brigid Mooney
bkmooney at gmail.com
Fri Jun 1 17:23:03 CEST 2012
Hopefully this is an easy problem...
I'm trying to add a partitioned rank column to a data frame, where the
rank is calculated separately within each category, the way you could
easily do in SQL with a ranking partitioned by a column. I found this
solution in the archives that looked like it might work:
http://tolstoy.newcastle.edu.au/R/e11/help/10/09/8675.html
The example has a data frame with several car companies, and employee
salaries within them. A column is then added to the data.frame which
should give the descending rank for each employee, partitioned by
company.
But when I implemented it, the results weren't the expected rankings.
What am I doing wrong?
set.seed(1)
DF <- data.frame(Company = sample(c("Ford", "Toyota", "GM"), size = 18, replace = TRUE),
                 Person = LETTERS[1:18],
                 Salary = runif(18) * 1e5)
# Attempted descending rank of Salary within each Company
DF <- within(DF, rank <- ave(Salary, Company, FUN = function(x) rev(order(x))))
# Then checking each category manually
DF[DF$Company == "Ford",]
DF[DF$Company == "GM",]
DF[DF$Company == "Toyota",]
# My results show that it works for Ford and GM, but not Toyota
> DF[DF$Company == "Ford",]
   Company Person   Salary rank
1     Ford      A 38003.52    4
5     Ford      E 65167.38    2
10    Ford      J 38238.80    3
11    Ford      K 86969.08    1
12    Ford      L 34034.90    5
> DF[DF$Company == "GM",]
   Company Person   Salary rank
4       GM      D 21214.25    6
6       GM      F 12555.51    7
7       GM      G 26722.07    5
13      GM      M 48208.01    4
15      GM      O 49354.13    3
17      GM      Q 82737.33    1
18      GM      R 66846.67    2
> DF[DF$Company == "Toyota",]
   Company Person    Salary rank
2   Toyota      B 77744.522    2
3   Toyota      C 93470.523    1
8   Toyota      H 38611.409    5
9   Toyota      I  1339.033    3
14  Toyota      N 59956.583    6
16  Toyota      P 18621.760    4
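
A possible explanation for the Toyota mismatch, sketched on the six
Toyota salaries shown above: order(x) returns the positions that would
sort x, which is not the same thing as the rank of each element. The
two coincide only when the sorting permutation is its own inverse,
which appears to be the case for the Ford and GM rows. The variable
names below are illustrative only.

```r
# Toyota salaries copied from the output above, in row order B, C, H, I, N, P
x <- c(77744.522, 93470.523, 38611.409, 1339.033, 59956.583, 18621.760)

# rev(order(x)) lists WHICH positions hold the largest, 2nd largest, ... value
positions <- rev(order(x))
print(positions)   # [1] 2 1 5 3 6 4

# rank(-x) gives, for EACH element, its descending rank
desc_rank <- rank(-x)
print(desc_rank)   # [1] 2 1 4 6 3 5
```

The first result matches the rank column printed for Toyota; the second
is the descending ranking one would expect by inspecting the salaries.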
For reference, I'm using R 2.11.1 on a Windows 7 machine.
Can anyone provide insight into how I am implementing this
incorrectly, or give an alternate way to add such a partitioned rank
column to a data frame?
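
One alternate way, as a minimal sketch rather than a definitive answer:
keep ave() for the partitioning but pass rank() on the negated salaries
instead of rev(order()). The data frame below is rebuilt by hand from
the Ford and Toyota rows printed above so that it does not depend on
the random seed; the choice of ties.method = "min" is an assumption
meant to resemble how SQL's RANK() treats ties.

```r
# Ford and Toyota rows copied from the output in this post
DF <- data.frame(
  Company = c("Ford", "Toyota", "Toyota", "Ford", "Toyota", "Toyota",
              "Ford", "Ford", "Ford", "Toyota", "Toyota"),
  Person  = c("A", "B", "C", "E", "H", "I", "J", "K", "L", "N", "P"),
  Salary  = c(38003.52, 77744.522, 93470.523, 65167.38, 38611.409, 1339.033,
              38238.80, 86969.08, 34034.90, 59956.583, 18621.760)
)

# Descending rank of Salary computed separately within each Company.
# ties.method = "min" is an assumption: tied salaries share the lowest rank.
DF$rank <- ave(DF$Salary, DF$Company,
               FUN = function(x) rank(-x, ties.method = "min"))

# Show each company with its highest salary first
print(DF[order(DF$Company, DF$rank), ])
```

With this version the Toyota rows B, C, H, I, N, P get ranks
2, 1, 4, 6, 3, 5, and the Ford ranks are unchanged from those above.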
Thanks in advance,
Brigid