[R] find most repeated item from column in dataframe

Marc Schwartz marc_schwartz at me.com
Wed Aug 25 04:31:54 CEST 2010


How about this approach, using aggregate():

> DF
   StandID PlotNum HerbNum  Woody
1      001       1       1    low
2      001       2       2 medium
3      001       3       1    low
4      001       4       3    low
5      001       5       1   high
6      001       6       2 medium
7      002       1       1   high
8      002       2       2   high
9      002       3       2    low
10     002       4       3   high
11     002       5       1   high
12     002       6       2 medium


> aggregate(DF[, 3:4], list(StandID = DF$StandID), 
            function(x) names(which.max(table(x))))
  StandID HerbNum Woody
1     001       1   low
2     002       2  high
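
For anyone wanting to rerun this, here is a self-contained sketch of the same aggregate() call. The data frame is rebuilt by hand from the listing above, and the helper name most_common is only illustrative (it is not from the thread). StandID is assumed to be stored as character so the leading zeros survive.

```r
# Rebuild the example data from the thread (assumption: StandID is character).
DF <- data.frame(
  StandID = rep(c("001", "002"), each = 6),
  PlotNum = rep(1:6, times = 2),
  HerbNum = c(1, 2, 1, 3, 1, 2, 1, 2, 2, 3, 1, 2),
  Woody   = c("low", "medium", "low", "low", "high", "medium",
              "high", "high", "low", "high", "high", "medium"),
  stringsAsFactors = FALSE
)

# Most frequent value of x, returned as a character string.
# which.max() keeps only the first maximum, so ties are broken by the
# sort order of the table names, not by order of appearance in x.
most_common <- function(x) names(which.max(table(x)))

res <- aggregate(DF[, c("HerbNum", "Woody")],
                 by = list(StandID = DF$StandID),
                 FUN = most_common)

# names() always yields character, so convert HerbNum back to numeric.
res$HerbNum <- as.numeric(res$HerbNum)
print(res)
```

Two things to keep in mind with this approach: every summarised column comes back as character unless converted, and a tie within a stand is resolved silently in favour of whichever value sorts first in the table.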


HTH,

Marc Schwartz


On Aug 24, 2010, at 9:14 PM, Bill.Venables at csiro.au wrote:

> Do you expect this to be easy?  It may be, but I can't see a particularly graceful way to do it.  Here is one possible solution.
> 
>> dat
>   StandID PlotNum HerbNum  Woody
> 1      001       1       1    low
> 2      001       2       2 medium
> 3      001       3       1    low
> 4      001       4       3    low
> 5      001       5       1   high
> 6      001       6       2 medium
> 7      002       1       1   high
> 8      002       2       2   high
> 9      002       3       2    low
> 10     002       4       3   high
> 11     002       5       1   high
> 12     002       6       2 medium
>> getMostCommon <- function(x) {
> 	tx <- table(x)
> 	m <- which(tx == max(tx))[1]
> 	as(names(tx)[m], class(x))
> }
>> val <- unclass(by(dat[,-1], dat$StandID, function(x) lapply(x, getMostCommon)))
>> (newDat <- cbind(StandID = names(val), as.data.frame(do.call(rbind, val))))
>    StandID PlotNum HerbNum Woody
> 001     001       1       1   low
> 002     002       1       2  high
> 
> This sort of gets you the answer, but it is not quite what it seems: the columns of newDat are lists rather than ordinary vectors.  One way to make it more manageable is
> 
>> for(j in 2:ncol(newDat)) newDat[[j]] <- unlist(newDat[[j]])
>> newDat
>    StandID PlotNum HerbNum Woody
> 001     001       1       1   low
> 002     002       1       2  high
> 
> This is now a data frame with columns (more or less) what they appear to be.
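
The list-column issue described above can be reproduced in isolation. This is a minimal sketch, not the code from the thread: the object rows is a hypothetical stand-in for the per-stand results that by() returns, with only two columns kept for brevity.

```r
# Hypothetical stand-in for the per-group results: one list per stand.
rows <- list("001" = list(HerbNum = 1, Woody = "low"),
             "002" = list(HerbNum = 2, Woody = "high"))

# rbind() on lists gives a matrix of mode "list", so as.data.frame()
# produces columns that are themselves lists.
newDat <- as.data.frame(do.call(rbind, rows))
was_list <- sapply(newDat, is.list)
print(was_list)

# Flatten each column to an ordinary atomic vector.
for (j in seq_along(newDat)) newDat[[j]] <- unlist(newDat[[j]])
print(sapply(newDat, class))
```

Both data frames print identically, which is why the problem is easy to miss; checking with is.list() or str() shows the difference.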
> 
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Randy Cass
> Sent: Wednesday, 25 August 2010 11:33 AM
> To: r-help at r-project.org
> Subject: [R] find most repeated item from column in dataframe
> 
> R users,
> 
> I am trying to find the value in a column that is repeated the most
> for each StandID of a dataframe.  I have researched methods online
> and read the help pages, but have had no success in finding a solution.
> I have tried using the table function, but it returns counts for the
> whole dataset and not by StandID.  Any help will be appreciated.
> Thanks in advance.
> 
> R version 2.11.1
> Windows 7
> Dataframe is imported from text file
> 
> StandID  PlotNum  HerbNum  Woody
> 001      1        1        low
> 001      2        2        medium
> 001      3        1        low
> 001      4        3        low
> 001      5        1        high
> 001      6        2        medium
> 002      1        1        high
> 002      2        2        high
> 002      3        2        low
> 002      4        3        high
> 002      5        1        high
> 002      6        2        medium
> 
> I would like to get the following from the dataframe
> 
> StandID  HerbNum  Woody
> 001      1        low
> 002      2        high
> 
> Thanks,
> 
> Randy
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


