[R] Compute rank within factor groups
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Fri Jul 13 01:10:41 CEST 2007
Ken Williams wrote:
> Hi,
>
> I have a data.frame which is ordered by score, and has a factor column:
>
> Browse[1]> wc[c("report","score")]
>         report score
> 9         ADEA  0.96
> 8         ADEA  0.90
> 11 Asylum_FED9  0.86
> 3         ADEA  0.75
> 14 Asylum_FED9  0.60
> 5         ADEA  0.56
> 13 Asylum_FED9  0.51
> 16 Asylum_FED9  0.51
> 2         ADEA  0.42
> 7         ADEA  0.31
> 17 Asylum_FED9  0.27
> 1         ADEA  0.17
> 4         ADEA  0.17
> 6         ADEA  0.12
> 10        ADEA  0.11
> 12 Asylum_FED9  0.10
> 15 Asylum_FED9  0.09
> 18 Asylum_FED9  0.07
> Browse[1]>
>
> I need to add a column indicating rank within each factor group, which I
> currently accomplish like so:
>
> wc$rank <- 0
> for (report in as.character(unique(wc$report))) {
>   wc[wc$report == report, ]$rank <- 1:sum(wc$report == report)
> }
>
> I have to wonder whether there's a better way, something that gets rid of
> the for() loop using tapply() or by() or similar. But I haven't come up
> with anything.
>
> I've tried these:
>
> by(wc, wc$report, FUN=function(pr){pr$rank <- 1:nrow(pr)})
>
> by(wc, wc$report, FUN=function(pr){wc[wc$report %in% pr$report,]$rank <-
> 1:nrow(pr)})
>
> But in both cases the effect of the assignment is lost: no $rank
> column is generated for wc.
>
> Any suggestions?
>
There's a little-known and somewhat unfortunately named function called
ave() that does just that sort of thing.
> ave(wc$score, wc$report, FUN=rank)
 [1] 10.0  9.0  8.0  8.0  7.0  7.0  5.5  5.5  6.0  5.0  4.0  3.5  3.5  2.0  1.0
[16]  3.0  2.0  1.0
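
Note that rank() ranks in ascending order and averages ties, so the call above gives the highest score in each group the largest rank. The following is a minimal sketch, not necessarily what the original poster ended up using, of how the result could be stored as a column with rank 1 for the highest score, matching the for() loop in the question. The data frame is rebuilt from the values shown in the post (row names are not reproduced), and the column name pos is only illustrative.

```r
# Rebuild a data frame like the one in the question (scores copied from the post;
# the original row names are not reproduced here).
wc <- data.frame(
  report = c("ADEA", "ADEA", "Asylum_FED9", "ADEA", "Asylum_FED9", "ADEA",
             "Asylum_FED9", "Asylum_FED9", "ADEA", "ADEA", "Asylum_FED9",
             "ADEA", "ADEA", "ADEA", "ADEA", "Asylum_FED9", "Asylum_FED9",
             "Asylum_FED9"),
  score = c(0.96, 0.90, 0.86, 0.75, 0.60, 0.56, 0.51, 0.51, 0.42, 0.31,
            0.27, 0.17, 0.17, 0.12, 0.11, 0.10, 0.09, 0.07),
  stringsAsFactors = TRUE
)

# Negate the score so that the highest score gets rank 1 within each report.
# ties.method = "first" breaks ties by order of appearance, giving 1..n
# per group just like the for() loop in the question.
wc$rank <- ave(-wc$score, wc$report,
               FUN = function(x) rank(x, ties.method = "first"))

# Alternative that relies on the rows already being sorted by score:
# simply number the rows within each group in their current order.
wc$pos <- ave(seq_along(wc$score), wc$report, FUN = seq_along)

print(wc)
```

Because ave() returns a vector of the same length and order as its first argument, the result can be assigned straight into the data frame, which is what the by() attempts could not do: the assignment there happened on a copy inside the function.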