[R] Possible Improvement of the R code

Mon Sep 17 19:04:00 CEST 2012

Getting rid of one more matrix subscripting call trims the
time by another 10-20%.  With the following function the times
for 10^4 reps for a a 11-row input matrix become
      raw compiled
  f1 5.57     3.04
  f3 3.73     2.14
  f5 3.45     1.95
  f6 1.63     0.90
  f7 1.32     0.81

f7 <- function(param) {
    prev <- param[1,]
    for(i in 2:nrow(param)) {
        # do vector and matrix subscripting only once/iteration
        # and repeated sums only once/iteration.
        p1 <- prev[1]
        p2 <- prev[2]
        p42 <- prev[4]/2
        p342 <- prev[3] + p42
        p425 <- p42 + prev[5]
        param[i,] <- prev <- c(
            p342,
            p425,
            p1 * p342,
            p1 * p425 + p2 * p342,
            p2 * p425)
    }
    param
}
f7c <- cmpfun(f7)


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: William Dunlap
> Sent: Monday, September 17, 2012 8:53 AM
> To: 'Berend Hasselman'; li li
> Cc: r-help
> Subject: RE: [R] Possible Improvement of the R code
> 
> Unlike C or C++, subscripting vector or matrix is not almost free.
> Also, subscripting a matrix tends to make more time than subscripting a
> vector so it can help to pull out the previous row of params once
> per iteration, use vector subscripting on the that row to build the new
> row, and then insert the new row back into params with one matrix
> subscripting call:
> f5 <- function(param) {
>     for(i in 2:11) {
>         # do matrix subscripting only twice per iteration
>         prev <- param[i-1, ]
>         param[i,] <- c(
>             prev[3] + prev[4]/2,
>             prev[4]/2 + prev[5],
>             prev[1] * (prev[3] + prev[4]/2),
>             prev[1] * (prev[4]/2 + prev[5]) + prev[2] * (prev[3] + prev[4]/2),
>             prev[2] * (prev[4]/2 + prev[5]))
>     }
>     param
> }
> You can also do as much as possible to not repeat any subscripting or
> sums:
> f6 <- function(param) {
>     for(i in 2:11) {
>         # do any subscripting only twice/iteration and
>         # repeated sums only once/iteration.
>         prev <- param[i-1, ]
>         p1 <- prev[1]
>         p2 <- prev[2]
>         p42 <- prev[4]/2
>         p342 <- prev[3] + p42
>         p425 <- p42 + prev[5]
>         param[i,] <- c(
>             p342,
>             p425,
>             p1 * p342,
>             p1 * p425 + p2 * p342,
>             p2 * p425)
>     }
>     param
> }
> 
> I was curious about how much the compiler package could replace hand-optimization
> so I timed these with and without compiling.  (f1 and f3 are Berend's f1 and f3; I renamed
> his f2 and f4 to f1c and f3c to indicate they were the compiled versions.)
> > funs <- list(f1=f1, f1c=f1c, f3=f3, f3c=f3c, f5=f5, f5c=f5c, f6=f6, f6c=f6c)
> > z <- vapply(funs, function(f)system.time(for(i in 1:10000)f(param))[1:3], numeric(3))
> > z
>             f1  f1c   f3  f3c   f5  f5c   f6 f6c
> user.self 5.03 2.82 3.74 1.89 2.73 1.78 1.54 0.9
> sys.self  0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.0
> elapsed   5.03 2.83 3.75 1.90 2.76 1.77 1.54 0.9
> > matrix(z["user.self", ], byrow=TRUE, ncol=2, dimnames=list(c("f1","f3","f5","f6"),
> c("raw","compiled")))
>     raw compiled
> f1 5.03     2.82
> f3 3.74     1.89
> f5 2.73     1.78
> f6 1.54     0.90
> It looks like both hand- and machine-optimization help quite a bit here.
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> > Of Berend Hasselman
> > Sent: Sunday, September 16, 2012 9:28 PM
> > To: li li
> > Cc: r-help
> > Subject: Re: [R] Possible Improvement of the R code
> >
> >
> > On 17-09-2012, at 00:51, li li wrote:
> >
> > > Dear all,
> > >   In the following code, I was trying to compute each row of the "param"
> > > iteratively based
> > > on the first row.
> > >   This likely is not the best way. Can anyone suggest a simpler way to
> > > improve the code.
> > >   Thanks a lot!
> > >         Hannah
> > >
> > >
> > > param <- matrix(0, 11, 5)
> > >
> > > colnames(param) <- c("p", "q", "r", "2s", "t")
> > >
> > > param[1,] <- c(0.5, 0.5, 0.4, 0.5, 0.1)
> > >
> > > for (i in 2:11){
> > >
> > > param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
> > >
> > > param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
> > >
> > > param[i,3] <- param[(i-1),1]*(param[(i-1),3]+param[(i-1),4]/2)
> > >
> > > param[i,4] <- param[(i-1),1]*(param[(i-1),4]/2+param[(i-1),5])+param[(i-1),2
> > > ]*(param[(i-1),3]+param[(i-1),4]/2)
> > >
> > > param[i,5] <- param[(i-1),2]*(param[(i-1),4]/2+param[(i-1),5])
> > >
> > > }
> > >
> >
> > You can use the compiler package.
> > It also helps if you don't repeat certain calculations. For example (param[(i-
> > 1),3]+param[(i-1),4]/2) is computed three times.
> > Once is enough.
> >
> > See this example where your code has been put in function f1. The simplified code is in
> > function f3.
> > Functions f2 and f4 are the compiled versions of f1 and f3.
> >
> > library(compiler)
> > library(rbenchmark)
> > param <- matrix(0, 11, 5)
> > colnames(param) <- c("p", "q", "r", "2s", "t")
> > param[1,] <- c(0.5, 0.5, 0.4, 0.5, 0.1)
> >
> > # your calculation
> > f1 <- function(param) {
> >     for (i in 2:11){
> >         param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
> >         param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
> >         param[i,3] <- param[(i-1),1]*(param[(i-1),3]+param[(i-1),4]/2)
> >         param[i,4] <- param[(i-1),1]*(param[(i-1),4]/2+param[(i-1),5])+param[(i-
> > 1),2]*(param[(i-1),3]+param[(i-1),4]/2)
> >         param[i,5] <- param[(i-1),2]*(param[(i-1),4]/2+param[(i-1),5])
> >     }
> >
> >     param
> > }
> >
> > f2 <- cmpfun(f1)
> >
> > # modified by replacing identical sub-expressions with result
> > f3 <- function(param) {
> >     for (i in 2:11){
> >         param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
> >         param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
> >         param[i,3] <- param[(i-1),1]*param[i,1]
> >         param[i,4] <- param[(i-1),1]*param[i,2]+param[(i-1),2]*param[i,1]
> >         param[i,5] <- param[(i-1),2]*param[i,2]
> >     }
> >
> >     param
> > }
> >
> > f4 <- cmpfun(f3)
> >
> > z1 <- f1(param)
> > z2 <- f2(param)
> > z3 <- f3(param)
> > z4 <- f4(param)
> >
> > Running in R
> >
> > > all.equal(z2,z1)
> > [1] TRUE
> > > all.equal(z3,z1)
> > [1] TRUE
> > > all.equal(z4,z1)
> > [1] TRUE
> > >
> > > benchmark(f1(param), f2(param), f3(param), f4(param),replications=5000,
> > columns=c("test", "replications", "elapsed", "relative"))
> >        test replications elapsed relative
> > 1 f1(param)         5000   3.748    2.502
> > 2 f2(param)         5000   2.104    1.405
> > 3 f3(param)         5000   2.745    1.832
> > 4 f4(param)         5000   1.498    1.000
> >
> > f4 is quite an improvement over f1.
> > It's quite possible that more can be gained but I'm too lazy to investigate further.
> >
> > Berend
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.