[R] Sorting dataframe by number of occurrences of factor

Petr Savicky savicky at praha1.ff.cuni.cz
Sat Apr 30 11:00:59 CEST 2011


On Fri, Apr 29, 2011 at 11:17:58PM -0700, adigs wrote:
> Apologies for what's probably quite simple, but I'm having some problems with
> sorting a data frame by the number of occurrences of each level of a factor.
> 
> df<-data.frame(id=c(1:20),name=c('a','b','b','c','a','d','b','e','d','d','c','a','b','a','a','b','f','b','c','g'))
> 
> I want to sort the dataframe so that the values of df$name that occur most
> often are at the bottom - ie. in the order:
> 
> attributes(sort(summary(df$name)))$name = "e" "f" "g" "c" "d" "a" "b":
> 
> > sort(summary(df$name))
> e f g c d a b 
> 1 1 1 3 3 5 6 
> 
> So the desired result is:
> 
> id name
>  8    e
> 17    f
> 20    g
>  4    c
> 11    c
> 19    c
>  6    d
>  9    d
> 10    d
>  1    a
>  5    a
> 12    a
> 14    a
> 15    a
>  2    b
>  3    b
>  7    b
> 13    b
> 16    b
> 18    b

Hi.

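Try the following. The first line computes, for every row, how often that
row's name occurs; the second orders the rows by that count and uses the
name itself to break ties between equally frequent names: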

  freq <- ave(rep(1, times=nrow(df)), df$name, FUN=sum)
  df[order(freq, df$name), ]
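
For completeness, here is a self-contained sketch of the same idea that
gets the per-row counts from table() instead of ave(). It assumes the data
from the question. The explicit factor() call and the names counts and
sorted are my additions for illustration: since R 4.0.0, data.frame() no
longer converts strings to factors by default, so the column is converted
by hand here.

```r
# Data from the original question; name is made a factor explicitly
# (an assumption matching the pre-4.0.0 default behaviour of data.frame()).
df <- data.frame(
  id = 1:20,
  name = factor(c('a','b','b','c','a','d','b','e','d','d',
                  'c','a','b','a','a','b','f','b','c','g'))
)

# Count how often each level occurs, then look up that count for every row.
counts <- table(df$name)
freq <- as.vector(counts[as.character(df$name)])

# Order by frequency first, then by name to break ties between levels
# that occur equally often (e.g. c and d, which both occur three times).
sorted <- df[order(freq, df$name), ]
print(sorted)
```

Rows that tie on both keys keep their original relative order, so the ids
within each name stay ascending, as in the desired result above.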

Hope this helps.

Petr Savicky.


