[Rd] An example of very slow computation
Michael Lachmann
lachmann at eva.mpg.de
Wed Aug 17 23:27:39 CEST 2011
On 17 Aug 2011, at 7:08PM, <cberry at tajo.ucsd.edu> <cberry at tajo.ucsd.edu> wrote:
> John C Nash <nashjc at uottawa.ca> writes:
>
>> This message is about a curious difference in timing between two ways of computing the
>> same function. One uses expm, so is expected to be a bit slower, but "a bit" turned out to
>> be a factor of >1000.
>
> Looks like A*t in expm(A*t) is a "matrix".
>
> 'getMethod("expm","matrix")' coerces it arg to a "Matrix", then calls
> expm(), whose method coerces its arg to a "dMatrix" and calls expm(),
> whose method coerces its arg to a 'dgeMatrix' and calls expm(), whose
> method finally calls '.Call(dgeMatrix_exp, x)'
>
> Whew!
>
> The time difference between 'expm( diag(10)+1 )' and 'expm( as( diag(10)+1,
> "dgeMatrix" ))' is a factor of 10 on my box.
>
You are right!
I was testing running nlogL below 100 times. expm() is then called 2500 times.
The total running time on my machine was 13 seconds.
If you replace in nlogL the part:
--
A<-rbind(
c(-k[1], k[2]),
c( k[1], -(k[2]+k[3]))
)
x0<-c(0,100)
sol<-function(t)100-sum(expm(A*t)%*%x0)
--
with:
--
A<-rbind(
c(-k[1], k[2]),
c( k[1], -(k[2]+k[3]))
)
A<-as(A,"dgeMatrix") # <--- this is the difference
sol<-function(t)100-sum(expm(A*t)%*%x0)
--
this time drops to 1.5 seconds (!).
At that point, expm() takes up much less time than, for example, calculating A*t in sol(), and the sum() - I think because conversions have to be done.
Thus, if m is a 2x2 dgeMatrix, then
> system.time({for(i in 1:2500) m*3.2})
user system elapsed
0.425 0.004 0.579
(i.e. 1/3 of the total time for nlogL() above)
whereas if mm=as.matrix(m), then
> system.time({for(i in 1:2500) mm*3.2})
user system elapsed
0.004 0.000 0.005
(!!)
and, similarly:
--
> system.time({for(i in 1:2500) sum(m)})
user system elapsed
0.399 0.002 0.494
> system.time({for(i in 1:2500) sum(mm)})
user system elapsed
0.002 0.000 0.028
--
whereas
> system.time({for(i in 1:2500) expm(m)})
user system elapsed
0.106 0.001 0.118
to sum it up, of 13 seconds, 11.5 were spent on conversions to dgeMatrix
0.5 are spent on multiplying a dgeMatrix by a double
0.5 are spent on summing a dgeMatrix
and 0.1 are actually spent in expm.
Michael
P.S. You could have used Rprof() to see these times, only that interpreting summaryRprof() is a bit hard. (Is there something that does summaryRprof(), but also shows the call graph?)
> Dunno 'bout the other factor of 100.
>
> Chuck
>
>
>
>
>> The problem arose in an activity to develop guidelines for nonlinear
>> modeling in ecology (at NCEAS, Santa Barbara, with worldwide participants), and we will be
>> trying to include suggestions of how to prepare problems like this for efficient and
>> effective solution. The code for nlogL was the "original" from the worker who supplied the
>> problem.
>>
>> Best,
>>
>> John Nash
>>
>> --------------------------------------------------------------------------------------
>>
>> cat("mineral-timing.R == benchmark MIN functions in R\n")
>> # J C Nash July 31, 2011
>>
>> require("microbenchmark")
>> require("numDeriv")
>> library(Matrix)
>> library(optimx)
>> # dat<-read.table('min.dat', skip=3, header=FALSE)
>> # t<-dat[,1]
>> t <- c(0.77, 1.69, 2.69, 3.67, 4.69, 5.71, 7.94, 9.67, 11.77, 17.77,
>> 23.77, 32.77, 40.73, 47.75, 54.90, 62.81, 72.88, 98.77, 125.92, 160.19,
>> 191.15, 223.78, 287.70, 340.01, 340.95, 342.01)
>>
>> # y<-dat[,2] # ?? tidy up
>> y<- c(1.396, 3.784, 5.948, 7.717, 9.077, 10.100, 11.263, 11.856, 12.251, 12.699,
>> 12.869, 13.048, 13.222, 13.347, 13.507, 13.628, 13.804, 14.087, 14.185, 14.351,
>> 14.458, 14.756, 15.262, 15.703, 15.703, 15.703)
>>
>>
>> ones<-rep(1,length(t))
>> theta<-c(-2,-2,-2,-2)
>>
>>
>> nlogL<-function(theta){
>> k<-exp(theta[1:3])
>> sigma<-exp(theta[4])
>> A<-rbind(
>> c(-k[1], k[2]),
>> c( k[1], -(k[2]+k[3]))
>> )
>> x0<-c(0,100)
>> sol<-function(t)100-sum(expm(A*t)%*%x0)
>> pred<-sapply(dat[,1],sol)
>> -sum(dnorm(dat[,2],mean=pred,sd=sigma, log=TRUE))
>> }
>>
>> getpred<-function(theta, t){
>> k<-exp(theta[1:3])
>> sigma<-exp(theta[4])
>> A<-rbind(
>> c(-k[1], k[2]),
>> c( k[1], -(k[2]+k[3]))
>> )
>> x0<-c(0,100)
>> sol<-function(tt)100-sum(expm(A*tt)%*%x0)
>> pred<-sapply(t,sol)
>> }
>>
>> Mpred <- function(theta) {
>> # WARNING: assumes t global
>> kvec<-exp(theta[1:3])
>> k1<-kvec[1]
>> k2<-kvec[2]
>> k3<-kvec[3]
>> # MIN problem terbuthylazene disappearance
>> z<-k1+k2+k3
>> y<-z*z-4*k1*k3
>> l1<-0.5*(-z+sqrt(y))
>> l2<-0.5*(-z-sqrt(y))
>> val<-100*(1-((k1+k2+l2)*exp(l2*t)-(k1+k2+l1)*exp(l1*t))/(l2-l1))
>> } # val should be a vector if t is a vector
>>
>> negll <- function(theta){
>> # non expm version JN 110731
>> pred<-Mpred(theta)
>> sigma<-exp(theta[4])
>> -sum(dnorm(dat[,2],mean=pred,sd=sigma, log=TRUE))
>> }
>>
>> theta<-rep(-2,4)
>> fand<-nlogL(theta)
>> fsim<-negll(theta)
>> cat("Check fn vals: expm =",fand," simple=",fsim," diff=",fand-fsim,"\n")
>>
>> cat("time the function in expm form\n")
>> tnlogL<-microbenchmark(nlogL(theta), times=100L)
>> tnlogL
>>
>> cat("time the function in simpler form\n")
>> tnegll<-microbenchmark(negll(theta), times=100L)
>> tnegll
>>
>> ftimes<-data.frame(texpm=tnlogL$time, tsimp=tnegll$time)
>> # ftimes
>>
>>
>> boxplot(log(ftimes))
>> title("Log times in nanoseconds for matrix exponential and simple MIN fn")
>>
>
> --
> Charles C. Berry cberry at tajo.ucsd.edu
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
More information about the R-devel
mailing list