[R] row index for max values of row groups

William Dunlap wdunlap at tibco.com
Wed Nov 7 20:38:18 CET 2012


Note that the unlist(tapply()) algorithm depends on the groups column
being in order.  Here is one that works no matter how the
data frame is ordered.
  > which( with(df1, {tmp <- logical(length(groups)) ; split(tmp, groups) <- lapply(split(values, groups), function(x)x==max(x)) ; tmp}))
  [1]  4  8 11
  > df1[.Last.value, ]
     groups values
  4       1      2
  8       2      3
  11      3      4
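
Spelled out as a function, the one-liner above might look like the sketch
below. The name which_group_max is made up for illustration and is not part
of base R. It assumes no missing values and, like the original, reports every
tied maximum within a group.

```r
# Data from the original question.
df1 <- data.frame(
  groups = c(1,1,1,1,1,2,2,2,3,3,3,3),
  values = c(1,1,1,2,1,1,2,3,2,1,4,3)
)

# Hypothetical helper: row positions of the per-group maxima,
# independent of how the rows are ordered.
which_group_max <- function(values, groups) {
  is_max <- logical(length(values))    # all FALSE, one slot per row
  flags <- lapply(split(values, groups), function(x) x == max(x))
  split(is_max, groups) <- flags       # write each group's flags back in place
  which(is_max)
}

which_group_max(df1$values, df1$groups)
```

Because split()<- puts each group's flags back at the positions the group
came from, the result lines up with the rows of the data frame as given.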

Try reordering the data.frame and you get essentially the same result:
  > df1a <- df1[c(12,1,11,2,10,3,9,4,8,5,7,6),]
  > which( with(df1a, {tmp <- logical(length(groups)) ; split(tmp, groups) <- lapply(split(values, groups), function(x)x==max(x)) ;   tmp}))
  [1] 3 8 9
  > df1a[.Last.value, ]
     groups values
  11      3      4
  4       1      2
  8       2      3

In contrast, unlist(tapply()) picks the wrong rows:
  > which(unlist(tapply(df1a$values, df1a$groups, FUN=function(x) x == max(x)), use.names=FALSE))
  [1]  4  6 10
  > df1a[.Last.value, ]
    groups values
  2      1      1
  3      1      1
  5      1      1
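
A small check of why this goes wrong, as a sketch: tapply() returns its
pieces group by group, so the unlisted flags line up with the rows of the
group-sorted data frame rather than with df1a itself. The names flags and
sorted are mine.

```r
df1 <- data.frame(
  groups = c(1,1,1,1,1,2,2,2,3,3,3,3),
  values = c(1,1,1,2,1,1,2,3,2,1,4,3)
)
df1a <- df1[c(12,1,11,2,10,3,9,4,8,5,7,6), ]

# One logical flag per row, but concatenated in group order.
flags <- unlist(tapply(df1a$values, df1a$groups, function(x) x == max(x)),
                use.names = FALSE)
sorted <- df1a[order(df1a$groups), ]

# Applied to the group-sorted rows, the flags find the maxima ...
rownames(sorted)[flags]   # "4" "8" "11"
# ... applied to df1a in its own order, they do not.
rownames(df1a)[flags]     # "2" "3" "5"
```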

You could sort the data.frame and unsort the result from the tapply approach:

  > ord <- with(df1a, order(groups))
  > with(df1a[ord,], which(unlist(tapply(values, groups, FUN=function(x) x == max(x)), use.names=FALSE)[order(ord)]))
  [1] 3 8 9
  > df1a[.Last.value, ]
     groups values
  11      3      4
  4       1      2
  8       2      3
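
The unsorting step relies on order(ord) being the inverse permutation of
ord. A sketch of the same approach as a function follows; the name
which_group_max_sorted is hypothetical, and it assumes groups has no missing
values, since tapply() would drop those and the lengths would no longer
match.

```r
df1 <- data.frame(
  groups = c(1,1,1,1,1,2,2,2,3,3,3,3),
  values = c(1,1,1,2,1,1,2,3,2,1,4,3)
)
df1a <- df1[c(12,1,11,2,10,3,9,4,8,5,7,6), ]

# Hypothetical helper: sort by group, flag the maxima, then undo the sort.
which_group_max_sorted <- function(values, groups) {
  ord <- order(groups)
  flags <- unlist(tapply(values[ord], groups[ord],
                         function(x) x == max(x)), use.names = FALSE)
  which(flags[order(ord)])   # order(ord) maps sorted positions back
}

ord <- order(df1a$groups)
inv <- order(ord)
ord[inv]                     # 1 2 ... 12, so inv undoes ord

which_group_max_sorted(df1a$values, df1a$groups)
```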

(The split()<-split() idiom is close to what ave() does, but ave() stores
FUN's output back into a copy of its first argument, so the result is
coerced to that argument's type.  Here we want a numeric input and a
logical output.  Perhaps ave() could use a new argument to handle this
kind of thing.)
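
To make the ave() remark concrete, here is a sketch. The function ave_as and
its template argument are hypothetical, one possible shape for such an
extension, and not anything that exists in base R.

```r
df1 <- data.frame(
  groups = c(1,1,1,1,1,2,2,2,3,3,3,3),
  values = c(1,1,1,2,1,1,2,3,2,1,4,3)
)

# ave() puts the logical results into the numeric input, so they become 0/1.
flags_num <- ave(df1$values, df1$groups, FUN = function(x) x == max(x))
is.numeric(flags_num)   # TRUE

# Hypothetical variant: 'template' decides the type of the result.
ave_as <- function(x, g, FUN, template = logical(length(x))) {
  split(template, g) <- lapply(split(x, g), FUN)
  template
}

which(ave_as(df1$values, df1$groups, function(x) x == max(x)))
```

With a logical template the flags stay logical, so no as.logical() is needed
around the call.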

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Rui Barradas
> Sent: Wednesday, November 07, 2012 11:13 AM
> To: arun
> Cc: R help; Omphalodes Verna
> Subject: Re: [R] row index for max values of row groups
> 
> Hello,
> 
> Though my function is the same as Arun's, it is wrapped in a different way
> of returning the index.
> 
> which(unlist(tapply(df1$values, df1$groups, FUN=function(x) x == max(x))))
> 
> 
> Hope this helps,
> 
> Rui Barradas
> On 07-11-2012 18:54, arun wrote:
> > Hi,
> > One method would be:
> > row.names(df1[unlist(tapply(df1$values,df1$groups,FUN=function(x) x==max(x))),])
> > #[1] "4"  "8"  "11"
> > #or
> > row.names(df1[as.logical(ave(df1$values,df1$groups,FUN=function(x) x==max(x))),])
> > #[1] "4"  "8"  "11"
> > A.K.
> >
> >
> >
> >
> > ----- Original Message -----
> > From: Omphalodes Verna <omphalodes.verna at yahoo.com>
> > To: "r-help at r-project.org" <r-help at r-project.org>
> > Cc:
> > Sent: Wednesday, November 7, 2012 1:41 PM
> > Subject: [R] row index for max values of row groups
> >
> > Dear list members!
> > I am looking for a "nice solution" to a (maybe) simple problem. I need code (a small
> > program) to calculate the row index of the max value (example below: df1$values) by group
> > (example below: df1$groups).
> > df1 <- data.frame(
> > groups = c(1,1,1,1,1,2,2,2,3,3,3,3),
> > values = c(1,1,1,2,1,1,2,3,2,1,4,3)
> > )
> > df1
> >
> > expected results
> >
> >> 4 8 11 # row index of max values by group
> > Thanks to all for help, OV
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.



