[R] Second largest element from each matrix row

William Dunlap wdunlap at tibco.com
Tue Apr 26 18:27:31 CEST 2011


I hit the send button before adding the timings for the
case with lots of columns and few rows.  f3 goes from the
best to the worst in this case.  There is rarely a single
most efficient function for all datasets.

> x <- t(x)
> benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
+     replications=5, columns=c("test","replications","elapsed"),
+     order="elapsed")
         test replications elapsed
4 r4 <- f4(x)            5    0.19
2 r2 <- f2(x)            5    0.24
1 r1 <- f1(x)            5    0.79
3 r3 <- f3(x)            5    3.75
> identical(r1,r2) && identical(r1, r3) && identical(r1, r4)
[1] TRUE
> dim(x)
[1]      6 100000
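
As a rough illustration of that point, one could wrap two of the
functions from the quoted message in a dispatcher that picks by shape.
This is only a sketch: the names second_by_order, second_by_drop_max
and second_largest, and the max_cols cutoff, are made up for
illustration, and the cutoff is an arbitrary placeholder that would
need to be tuned by benchmarking on the data at hand.

```r
# order() on row index then value (f3 in the quoted message);
# it was fastest above for many short rows.
second_by_order <- function(x) {
    y <- t(x)
    array(y[order(col(y), y)], dim(y))[nrow(y) - 1, ]
}

# Drop one maximum per row, then take the max of the rest (f4 in the
# quoted message); it was fastest above for few long rows.
second_by_drop_max <- function(x)
    apply(x, 1, function(row) max(row[-which.max(row)]))

# Hypothetical wrapper: max_cols = 50 is an assumed placeholder,
# not a measured crossover point.
second_largest <- function(x, max_cols = 50) {
    stopifnot(is.matrix(x), ncol(x) >= 2)
    if (ncol(x) <= max_cols) second_by_order(x) else second_by_drop_max(x)
}

set.seed(1)
tall <- matrix(runif(1000 * 6), nrow = 1000)  # many short rows
wide <- t(tall)                               # few long rows
r_tall <- second_largest(tall)
r_wide <- second_largest(wide)
```

Both branches return a plain numeric vector with one entry per row,
so callers do not need to know which one ran.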

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of William Dunlap
> Sent: Tuesday, April 26, 2011 9:11 AM
> To: peter dalgaard; David Winsemius
> Cc: r-help at r-project.org
> Subject: Re: [R] Second largest element from each matrix row
> 
> A different approach is to use order() to sort
> first by row number and then break the ties by
> value.  It is quick when there are lots of short
> rows.
> 
> > f1 <- function (x) 
> +    apply(x, 1, function(row) sort(row, decreasing = TRUE)[2])
> > f2 <- function (x) 
> +     -apply(-x, 1, function(row) sort.int(row, partial = 2)[2])
> > f3 <- function (x) 
> + {   
> +     # order by row number then by value
> +     y <- t(x)
> +     array(y[order(col(y), y)], dim(y))[nrow(y) - 1, ]
> + }
> > f4 <- function (x) 
> +     apply(x, 1, function(row) max(row[-which.max(row)]))
> > x <- matrix(runif(1e5*6), nrow=1e5)
> > library(rbenchmark)
> > benchmark(r1 <- f1(x), r2 <- f2(x), r3 <- f3(x), r4 <- f4(x),
> +     replications=5, columns=c("test","replications","elapsed"),
> +     order="elapsed")
>          test replications elapsed
> 3 r3 <- f3(x)            5    1.08
> 4 r4 <- f4(x)            5   12.59
> 2 r2 <- f2(x)            5   23.19
> 1 r1 <- f1(x)            5   59.54
> > identical(r1,r2) && identical(r1, r3) && identical(r1, r4)
> [1] TRUE
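
To make the order() idea concrete, here is a small worked example on
a 3 x 3 matrix. It follows the same steps as f3 above; the matrix
values and the intermediate names (idx, sorted, second) are only
illustrative.

```r
x <- matrix(c(3, 9, 4,
              7, 1, 8,
              5, 6, 2), nrow = 3, byrow = TRUE)

y <- t(x)                        # each column of y is one row of x
idx <- order(col(y), y)          # sort by column index, ties broken by value
sorted <- array(y[idx], dim(y))  # every column is now in ascending order
second <- sorted[nrow(y) - 1, ]  # next-to-last entry of each column
second                           # 4 7 5
```

A single call to order() replaces one sort() call per row, which is
why it pays off when there are many short rows.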
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com  
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org 
> > [mailto:r-help-bounces at r-project.org] On Behalf Of peter dalgaard
> > Sent: Tuesday, April 26, 2011 8:13 AM
> > To: David Winsemius
> > Cc: r-help at r-project.org
> > Subject: Re: [R] Second largest element from each matrix row
> > 
> > 
> > On Apr 26, 2011, at 14:36 , David Winsemius wrote:
> > 
> > > 
> > > On Apr 26, 2011, at 8:01 AM, Lars Bishop wrote:
> > > 
> > >> Hi,
> > >> 
> > >> I need to extract the second largest element from each row of a
> > >> matrix. Below is my solution, but I think there should be a more
> > >> efficient way to accomplish the same, or not?
> > >> 
> > >> 
> > >> set.seed(1)
> > >> a <- matrix(rnorm(9), 3 ,3)
> > >> sec.large <- as.vector(apply(a, 1, order, decreasing=TRUE)[2,])
> > >> ans <- sapply(1:length(sec.large), function(i) a[i, sec.large[i]])
> > >> ans
> > > 
> > > There are probably many, but this one is reasonably compact,
> > > one-step, and readable:
> > > 
> > > > ans2 <- apply(a, 1, function(i) sort(i)[ dim(a)[2]-1])
> > > > ans2
> > > 
> > > Refreshing my mail client proves I was right about many
> > > solutions, but this is the first (so far) to use the dim attribute.
> > 
> > Anything with sort() or order() will have complexity 
> > O(n*log(n)) or worse (n is the number of columns), whereas 
> > finding the k-th largest element has complexity O(k*n). 
> > 
> > For moderate n, this may be unimportant, but you could 
> > potentially find a speedup using
> > 
> > -sort.int(-i, partial=2)[2]
> > 
> > or
> > 
> > max(i[-which.max(i)])
> > 
> > -- 
> > Peter Dalgaard
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
> > 
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> > 
> 


