[R] find most repeated item from column in dataframe
Marc Schwartz
marc_schwartz at me.com
Wed Aug 25 04:31:54 CEST 2010
How about this approach, using aggregate():
> DF
   StandID PlotNum HerbNum  Woody
1      001       1       1    low
2      001       2       2 medium
3      001       3       1    low
4      001       4       3    low
5      001       5       1   high
6      001       6       2 medium
7      002       1       1   high
8      002       2       2   high
9      002       3       2    low
10     002       4       3   high
11     002       5       1   high
12     002       6       2 medium
> aggregate(DF[, 3:4], list(StandID = DF$StandID),
            function(x) names(which.max(table(x))))
  StandID HerbNum Woody
1     001       1   low
2     002       2  high
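For anyone who wants to run this without the original text file, here is a minimal self-contained sketch. The data frame is retyped from the listing above, and most_common is just a hypothetical name for the anonymous function. Two caveats, as far as I can tell: which.max() returns the first maximum, so a tie goes to whichever value sorts first in table(x), and names() always returns character, so HerbNum has to be converted back if numbers are needed.

```r
# Example data retyped from the thread; StandID is kept as character
# so that the leading zeros survive.
DF <- data.frame(
  StandID = rep(c("001", "002"), each = 6),
  PlotNum = rep(1:6, times = 2),
  HerbNum = c(1, 2, 1, 3, 1, 2, 1, 2, 2, 3, 1, 2),
  Woody   = c("low", "medium", "low", "low", "high", "medium",
              "high", "high", "low", "high", "high", "medium"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the most frequent value of x, as a character string.
# On a tie, the value that comes first in table(x) is returned.
most_common <- function(x) names(which.max(table(x)))

# One row per StandID, with the most common HerbNum and Woody values.
res <- aggregate(DF[, c("HerbNum", "Woody")],
                 by = list(StandID = DF$StandID),
                 FUN = most_common)

# Convert HerbNum back to numeric; as.character() guards against
# the column having been turned into a factor.
res$HerbNum <- as.numeric(as.character(res$HerbNum))
print(res)
```

This should give the same two rows as the result above (1 and low for stand 001, 2 and high for stand 002).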
HTH,
Marc Schwartz
On Aug 24, 2010, at 9:14 PM, Bill.Venables at csiro.au wrote:
> Do you expect this to be easy? It may be, but I can't see a particularly graceful way to do it. Here is one possible solution.
>
>> dat
>    StandID PlotNum HerbNum  Woody
> 1      001       1       1    low
> 2      001       2       2 medium
> 3      001       3       1    low
> 4      001       4       3    low
> 5      001       5       1   high
> 6      001       6       2 medium
> 7      002       1       1   high
> 8      002       2       2   high
> 9      002       3       2    low
> 10     002       4       3   high
> 11     002       5       1   high
> 12     002       6       2 medium
>> getMostCommon <- function(x) {
>   tx <- table(x)
>   m <- which(tx == max(tx))[1]
>   as(names(tx)[m], class(x))
> }
>> val <- unclass(by(dat[,-1], dat$StandID, function(x) lapply(x, getMostCommon)))
>> (newDat <- cbind(StandID = names(val), as.data.frame(do.call(rbind, val))))
>     StandID PlotNum HerbNum Woody
> 001     001       1       1   low
> 002     002       1       2  high
>
> This sort of gets you the answer, but it is not quite what it seems: the columns of newDat are lists rather than ordinary vectors. One way to make it more manageable is
>
>> for(j in 2:ncol(newDat)) newDat[[j]] <- unlist(newDat[[j]])
>> newDat
>     StandID PlotNum HerbNum Woody
> 001     001       1       1   low
> 002     002       1       2  high
>
> This is now a data frame with columns that are (more or less) what they appear to be.
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Randy Cass
> Sent: Wednesday, 25 August 2010 11:33 AM
> To: r-help at r-project.org
> Subject: [R] find most repeated item from column in dataframe
>
> R users,
>
> I am trying to find the value in a column that is repeated the most
> for each StandID of a dataframe. I have researched methods online
> and in the help pages, but have had no success in finding a solution. I have
> tried using the table function, but it returns counts for the whole dataset
> and not by StandID. Any help will be appreciated. Thanks in advance.
>
> R version 2.11.1
> Windows 7
> Dataframe is imported from text file
>
> StandID PlotNum HerbNum  Woody
>     001       1       1    low
>     001       2       2 medium
>     001       3       1    low
>     001       4       3    low
>     001       5       1   high
>     001       6       2 medium
>     002       1       1   high
>     002       2       2   high
>     002       3       2    low
>     002       4       3   high
>     002       5       1   high
>     002       6       2 medium
>
> I would like to get the following from the dataframe
>
> StandID HerbNum Woody
>     001       1   low
>     002       2  high
>
> Thanks,
>
> Randy
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.