[R] Sorting dataframe by number of occurrences of factor
Petr Savicky
savicky at praha1.ff.cuni.cz
Sat Apr 30 11:00:59 CEST 2011
On Fri, Apr 29, 2011 at 11:17:58PM -0700, adigs wrote:
> Apologies for what's probably quite simple, but I'm having some problems with
> sorting a data frame by the number of occurrences of each level of a factor.
>
> df<-data.frame(id=c(1:20),name=c('a','b','b','c','a','d','b','e','d','d','c','a','b','a','a','b','f','b','c','g'))
>
> I want to sort the dataframe so that the values of df$name that occur most
> often are at the bottom - ie. in the order:
>
> attributes(sort(summary(df$name)))$name = "e" "f" "g" "c" "d" "a" "b":
>
> > sort(summary(df$name))
> e f g c d a b
> 1 1 1 3 3 5 6
>
> So the desired result is:
>
> id name
> 8 e
> 17 f
> 20 g
> 4 c
> 11 c
> 19 c
> 6 d
> 9 d
> 10 d
> 1 a
> 5 a
> 12 a
> 14 a
> 15 a
> 2 b
> 3 b
> 7 b
> 13 b
> 16 b
> 18 b
Hi.
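Try the following. ave() computes, for each row, how many times its name
occurs in the data frame; order() then sorts by that count and breaks ties
by name, so rows with the same name stay together: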
freq <- ave(rep(1, times=nrow(df)), df$name, FUN=sum)
df[order(freq, df$name), ]
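
For comparison, here is a minimal sketch of the same idea using table() to get
the per-row counts instead of ave(). The names counts, freq2 and sorted are
only illustrative, and stringsAsFactors = TRUE is an assumption added so that
df$name is a factor regardless of the R version in use.

```r
# Data from the original question; stringsAsFactors = TRUE is an assumption
# made here so that name is a factor in any R version.
df <- data.frame(id = 1:20,
                 name = c('a','b','b','c','a','d','b','e','d','d',
                          'c','a','b','a','a','b','f','b','c','g'),
                 stringsAsFactors = TRUE)

# Number of occurrences of each level of name.
counts <- table(df$name)

# Look up the count for every row by level name.
freq2 <- as.vector(counts[as.character(df$name)])

# Sort by count first, then by name to keep equal-count groups together.
sorted <- df[order(freq2, df$name), ]
sorted
```

Both versions should give the same ordering; the ave() form avoids the
intermediate table, while the table() form makes the counts easy to inspect.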
Hope this helps.
Petr Savicky.