[R] Possible Improvement of the R code
William Dunlap
wdunlap at tibco.com
Mon Sep 17 19:04:00 CEST 2012
Getting rid of one more matrix subscripting call trims the
time by another 10-20%. With the following function the times
for 10^4 reps for a a 11-row input matrix become
raw compiled
f1 5.57 3.04
f3 3.73 2.14
f5 3.45 1.95
f6 1.63 0.90
f7 1.32 0.81
f7 <- function(param) {
prev <- param[1,]
for(i in 2:nrow(param)) {
# do vector and matrix subscripting only once/iteration
# and repeated sums only once/iteration.
p1 <- prev[1]
p2 <- prev[2]
p42 <- prev[4]/2
p342 <- prev[3] + p42
p425 <- p42 + prev[5]
param[i,] <- prev <- c(
p342,
p425,
p1 * p342,
p1 * p425 + p2 * p342,
p2 * p425)
}
param
}
f7c <- cmpfun(f7)
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: William Dunlap
> Sent: Monday, September 17, 2012 8:53 AM
> To: 'Berend Hasselman'; li li
> Cc: r-help
> Subject: RE: [R] Possible Improvement of the R code
>
> Unlike C or C++, subscripting vector or matrix is not almost free.
> Also, subscripting a matrix tends to make more time than subscripting a
> vector so it can help to pull out the previous row of params once
> per iteration, use vector subscripting on the that row to build the new
> row, and then insert the new row back into params with one matrix
> subscripting call:
> f5 <- function(param) {
> for(i in 2:11) {
> # do matrix subscripting only twice per iteration
> prev <- param[i-1, ]
> param[i,] <- c(
> prev[3] + prev[4]/2,
> prev[4]/2 + prev[5],
> prev[1] * (prev[3] + prev[4]/2),
> prev[1] * (prev[4]/2 + prev[5]) + prev[2] * (prev[3] + prev[4]/2),
> prev[2] * (prev[4]/2 + prev[5]))
> }
> param
> }
> You can also do as much as possible to not repeat any subscripting or
> sums:
> f6 <- function(param) {
> for(i in 2:11) {
> # do any subscripting only twice/iteration and
> # repeated sums only once/iteration.
> prev <- param[i-1, ]
> p1 <- prev[1]
> p2 <- prev[2]
> p42 <- prev[4]/2
> p342 <- prev[3] + p42
> p425 <- p42 + prev[5]
> param[i,] <- c(
> p342,
> p425,
> p1 * p342,
> p1 * p425 + p2 * p342,
> p2 * p425)
> }
> param
> }
>
> I was curious about how much the compiler package could replace hand-optimization
> so I timed these with and without compiling. (f1 and f3 are Berend's f1 and f3; I renamed
> his f2 and f4 to f1c and f3c to indicate they were the compiled versions.)
> > funs <- list(f1=f1, f1c=f1c, f3=f3, f3c=f3c, f5=f5, f5c=f5c, f6=f6, f6c=f6c)
> > z <- vapply(funs, function(f)system.time(for(i in 1:10000)f(param))[1:3], numeric(3))
> > z
> f1 f1c f3 f3c f5 f5c f6 f6c
> user.self 5.03 2.82 3.74 1.89 2.73 1.78 1.54 0.9
> sys.self 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.0
> elapsed 5.03 2.83 3.75 1.90 2.76 1.77 1.54 0.9
> > matrix(z["user.self", ], byrow=TRUE, ncol=2, dimnames=list(c("f1","f3","f5","f6"),
> c("raw","compiled")))
> raw compiled
> f1 5.03 2.82
> f3 3.74 1.89
> f5 2.73 1.78
> f6 1.54 0.90
> It looks like both hand- and machine-optimization help quite a bit here.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> > Of Berend Hasselman
> > Sent: Sunday, September 16, 2012 9:28 PM
> > To: li li
> > Cc: r-help
> > Subject: Re: [R] Possible Improvement of the R code
> >
> >
> > On 17-09-2012, at 00:51, li li wrote:
> >
> > > Dear all,
> > > In the following code, I was trying to compute each row of the "param"
> > > iteratively based
> > > on the first row.
> > > This likely is not the best way. Can anyone suggest a simpler way to
> > > improve the code.
> > > Thanks a lot!
> > > Hannah
> > >
> > >
> > > param <- matrix(0, 11, 5)
> > >
> > > colnames(param) <- c("p", "q", "r", "2s", "t")
> > >
> > > param[1,] <- c(0.5, 0.5, 0.4, 0.5, 0.1)
> > >
> > > for (i in 2:11){
> > >
> > > param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
> > >
> > > param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
> > >
> > > param[i,3] <- param[(i-1),1]*(param[(i-1),3]+param[(i-1),4]/2)
> > >
> > > param[i,4] <- param[(i-1),1]*(param[(i-1),4]/2+param[(i-1),5])+param[(i-1),2
> > > ]*(param[(i-1),3]+param[(i-1),4]/2)
> > >
> > > param[i,5] <- param[(i-1),2]*(param[(i-1),4]/2+param[(i-1),5])
> > >
> > > }
> > >
> >
> > You can use the compiler package.
> > It also helps if you don't repeat certain calculations. For example (param[(i-
> > 1),3]+param[(i-1),4]/2) is computed three times.
> > Once is enough.
> >
> > See this example where your code has been put in function f1. The simplified code is in
> > function f3.
> > Functions f2 and f4 are the compiled versions of f1 and f3.
> >
> > library(compiler)
> > library(rbenchmark)
> > param <- matrix(0, 11, 5)
> > colnames(param) <- c("p", "q", "r", "2s", "t")
> > param[1,] <- c(0.5, 0.5, 0.4, 0.5, 0.1)
> >
> > # your calculation
> > f1 <- function(param) {
> > for (i in 2:11){
> > param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
> > param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
> > param[i,3] <- param[(i-1),1]*(param[(i-1),3]+param[(i-1),4]/2)
> > param[i,4] <- param[(i-1),1]*(param[(i-1),4]/2+param[(i-1),5])+param[(i-
> > 1),2]*(param[(i-1),3]+param[(i-1),4]/2)
> > param[i,5] <- param[(i-1),2]*(param[(i-1),4]/2+param[(i-1),5])
> > }
> >
> > param
> > }
> >
> > f2 <- cmpfun(f1)
> >
> > # modified by replacing identical sub-expressions with result
> > f3 <- function(param) {
> > for (i in 2:11){
> > param[i,1] <- param[(i-1),3]+param[(i-1),4]/2
> > param[i,2] <- param[(i-1),4]/2+param[(i-1),5]
> > param[i,3] <- param[(i-1),1]*param[i,1]
> > param[i,4] <- param[(i-1),1]*param[i,2]+param[(i-1),2]*param[i,1]
> > param[i,5] <- param[(i-1),2]*param[i,2]
> > }
> >
> > param
> > }
> >
> > f4 <- cmpfun(f3)
> >
> > z1 <- f1(param)
> > z2 <- f2(param)
> > z3 <- f3(param)
> > z4 <- f4(param)
> >
> > Running in R
> >
> > > all.equal(z2,z1)
> > [1] TRUE
> > > all.equal(z3,z1)
> > [1] TRUE
> > > all.equal(z4,z1)
> > [1] TRUE
> > >
> > > benchmark(f1(param), f2(param), f3(param), f4(param),replications=5000,
> > columns=c("test", "replications", "elapsed", "relative"))
> > test replications elapsed relative
> > 1 f1(param) 5000 3.748 2.502
> > 2 f2(param) 5000 2.104 1.405
> > 3 f3(param) 5000 2.745 1.832
> > 4 f4(param) 5000 1.498 1.000
> >
> > f4 is quite an improvement over f1.
> > It's quite possible that more can be gained but I'm too lazy to investigate further.
> >
> > Berend
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list