[Rd] An example of very slow computation

Thu Sep 1 11:50:49 CEST 2011

>>>>> Michael Lachmann <lachmann at eva.mpg.de>
>>>>>     on Fri, 19 Aug 2011 02:08:48 +0200 writes:

    > On my trials, after eliminating all the extra matrix<->dgeMatrix
    > conversions, using expm() and the method below were equally fast.

    > Michael

I'm sorry I hadn't time to get into this thread when it was
hot..
and I have told Ravi in the mean time what I would have said
*VERY* early here:

1) There's an 'expm' package --- which will be mentioned on the 
   Matrix::expm help page in the next version of Matrix ---
   that has better and faster algorithms (notably also some
   which work with *sparse* matrices!) than the one in Matrix.

2) Matrix::expm()  is really more reliable than one of the 19
   dubious ways that Peter mentioned correctly,
   and, it is really typically not a good idea to use a faster
   but considerably less reliable method.

   Reliability  is *much* *much* more important
   than speed, *really*
   and we (R-core) have always tried to emphasize this approach
   in all contexts.

3) Thanks to Ravi, an upcoming version of the 'expm' package
   will also contain a (Krylov space) version of expm()
   which is faster for the special case of evaluating
      expm(A * t) %*% v   	   {A matrix, v vector, t scalar}
   and *still* numerically reliable.

Martin Maechler, ETH Zurich

    >> Which is why I said it applies when the system is "diagonalizable".
    >> It won't work for non-diagonalizable matrix A, because T (eigenvector
    >> matrix) is singular.
    >> 
    >> Ravi.  ________________________________________ From: peter dalgaard
    >> [pdalgd at gmail.com] Sent: Thursday, August 18, 2011 6:37 PM To: Ravi
    >> Varadhan Cc: 'cberry at tajo.ucsd.edu'; r-devel at stat.math.ethz.ch;
    >> 'nashjc at uottawa.ca' Subject: Re: [Rd] An example of very slow
    >> computation
    >> 
    >> On Aug 17, 2011, at 23:24 , Ravi Varadhan wrote:
    >> 
    >>> A principled way to solve this system of ODEs is to use the idea of
    >>> "fundamental matrix", which is the same idea as that of
    >>> diagonalization of a matrix (see Boyce and DiPrima or any ODE text).
    >>> 
    >>> Here is the code for that:
    >>> 
    >>> 
    >>> nlogL2 <- function(theta){ k <- exp(theta[1:3]) sigma <-
    >>> exp(theta[4]) A <- rbind( c(-k[1], k[2]), c( k[1], -(k[2]+k[3])) ) eA
    >>> <- eigen(A) T <- eA$vectors r <- eA$values x0 <- c(0,100) Tx0 <- T
    >>> %*% x0
    >>> 
    >>> sol <- function(t) 100 - sum(T %*% diag(exp(r*t)) %*% Tx0) pred <-
    >>> sapply(dat[,1], sol) -sum(dnorm(dat[,2], mean=pred, sd=sigma,
    >>> log=TRUE)) } This is much faster than using expm(A*t), but much
    >>> slower than "by hand" calculations since we have to repeatedly do the
    >>> diagonalization.  An obvious advantage of this fuunction is that it
    >>> applies to *any* linear system of ODEs for which the eigenvalues are
    >>> real (and negative).
    >> 
    >> I believe this is method 14 of the "19 dubious ways..." (Google for
    >> it) and doesn't work for certain non-symmetric A matrices.
    >> 
    >>> 
    >>> Ravi.
    >>> 
    >>> -------------------------------------------------------
    >>> Ravi Varadhan, Ph.D.
    >>> Assistant Professor,
    >>> Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University
    >>> 
    >>> Ph. (410) 502-2619
    >>> email: rvaradhan at jhmi.edu
    >>> 
    >>> 
    >>> -----Original Message-----
    >>> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of Ravi Varadhan
    >>> Sent: Wednesday, August 17, 2011 2:33 PM
    >>> To: 'cberry at tajo.ucsd.edu'; r-devel at stat.math.ethz.ch; 'nashjc at uottawa.ca'
    >>> Subject: Re: [Rd] An example of very slow computation
    >>> 
    >>> Yes, the culprit is the evaluation of expm(A*t).  This is a lazy way of solving the system of ODEs, where you save analytic effort, but you pay for it dearly in terms of computational effort!
    >>> 
    >>> Even in this lazy approach, I am sure there ought to be faster ways to evaluating exponent of a matrix than that in "Matrix" package.
    >>> 
    >>> Ravi.
    >>> 
    >>> -------------------------------------------------------
    >>> Ravi Varadhan, Ph.D.
    >>> Assistant Professor,
    >>> Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University
    >>> 
    >>> Ph. (410) 502-2619
    >>> email: rvaradhan at jhmi.edu
    >>> 
    >>> -----Original Message-----
    >>> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On Behalf Of cberry at tajo.ucsd.edu
    >>> Sent: Wednesday, August 17, 2011 1:08 PM
    >>> To: r-devel at stat.math.ethz.ch
    >>> Subject: Re: [Rd] An example of very slow computation
    >>> 
    >>> John C Nash <nashjc at uottawa.ca> writes:
    >>> 
    >>>> This message is about a curious difference in timing between two ways of computing the
    >>>> same function. One uses expm, so is expected to be a bit slower, but "a bit" turned out to
    >>>> be a factor of >1000. The code is below. We would be grateful if anyone can point out any
    >>>> egregious bad practice in our code, or enlighten us on why one approach is so much slower
    >>>> than the other.
    >>> 
    >>> Looks like A*t in expm(A*t) is a "matrix".
    >>> 
    >>> 'getMethod("expm","matrix")' coerces it arg to a "Matrix", then calls
    >>> expm(), whose method coerces its arg to a "dMatrix" and calls expm(),
    >>> whose method coerces its arg to a 'dgeMatrix' and calls expm(), whose
    >>> method finally calls '.Call(dgeMatrix_exp, x)'
    >>> 
    >>> Whew!
    >>> 
    >>> The time difference between 'expm( diag(10)+1 )' and 'expm( as( diag(10)+1,
    >>> "dgeMatrix" ))' is a factor of 10 on my box.
    >>> 
    >>> Dunno 'bout the other factor of 100.
    >>> 
    >>> Chuck
    >>> 
    >>> 
    >>>> The problem arose in an activity to develop guidelines for nonlinear
    >>>> modeling in ecology (at NCEAS, Santa Barbara, with worldwide participants), and we will be
    >>>> trying to include suggestions of how to prepare problems like this for efficient and
    >>>> effective solution. The code for nlogL was the "original" from the worker who supplied the
    >>>> problem.
    >>>> 
    >>>> Best,
    >>>> 
    >>>> John Nash
    >>>> 
    >>>> --------------------------------------------------------------------------------------
    >>>> 
    >>>> cat("mineral-timing.R == benchmark MIN functions in R\n")
    >>>> #  J C Nash July 31, 2011
    >>>> 
    >>>> require("microbenchmark")
    >>>> require("numDeriv")
    >>>> library(Matrix)
    >>>> library(optimx)
    >>>> # dat<-read.table('min.dat', skip=3, header=FALSE)
    >>>> # t<-dat[,1]
    >>>> t <- c(0.77,  1.69,  2.69,  3.67,  4.69,  5.71,  7.94,  9.67, 11.77, 17.77,
    >>>> 23.77, 32.77, 40.73, 47.75, 54.90, 62.81, 72.88, 98.77, 125.92, 160.19,
    >>>> 191.15, 223.78, 287.70, 340.01, 340.95, 342.01)
    >>>> 
    >>>> # y<-dat[,2] # ?? tidy up
    >>>> y<- c(1.396, 3.784, 5.948, 7.717, 9.077, 10.100, 11.263, 11.856, 12.251, 12.699,
    >>>> 12.869, 13.048, 13.222, 13.347, 13.507, 13.628, 13.804, 14.087, 14.185, 14.351,
    >>>> 14.458, 14.756, 15.262, 15.703, 15.703, 15.703)
    >>>> 
    >>>> 
    >>>> ones<-rep(1,length(t))
    >>>> theta<-c(-2,-2,-2,-2)
    >>>> 
    >>>> 
    >>>> nlogL<-function(theta){
    >>>> k<-exp(theta[1:3])
    >>>> sigma<-exp(theta[4])
    >>>> A<-rbind(
    >>>> c(-k[1], k[2]),
    >>>> c( k[1], -(k[2]+k[3]))
    >>>> )
    >>>> x0<-c(0,100)
    >>>> sol<-function(t)100-sum(expm(A*t)%*%x0)
    >>>> pred<-sapply(dat[,1],sol)
    >>>> -sum(dnorm(dat[,2],mean=pred,sd=sigma, log=TRUE))
    >>>> }
    >>>> 
    >>>> getpred<-function(theta, t){
    >>>> k<-exp(theta[1:3])
    >>>> sigma<-exp(theta[4])
    >>>> A<-rbind(
    >>>> c(-k[1], k[2]),
    >>>> c( k[1], -(k[2]+k[3]))
    >>>> )
    >>>> x0<-c(0,100)
    >>>> sol<-function(tt)100-sum(expm(A*tt)%*%x0)
    >>>> pred<-sapply(t,sol)
    >>>> }
    >>>> 
    >>>> Mpred <- function(theta) {
    >>>> # WARNING: assumes t global
    >>>> kvec<-exp(theta[1:3])
    >>>> k1<-kvec[1]
    >>>> k2<-kvec[2]
    >>>> k3<-kvec[3]
    >>>> #   MIN problem terbuthylazene disappearance
    >>>> z<-k1+k2+k3
    >>>> y<-z*z-4*k1*k3
    >>>> l1<-0.5*(-z+sqrt(y))
    >>>> l2<-0.5*(-z-sqrt(y))
    >>>> val<-100*(1-((k1+k2+l2)*exp(l2*t)-(k1+k2+l1)*exp(l1*t))/(l2-l1))
    >>>> } # val should be a vector if t is a vector
    >>>> 
    >>>> negll <- function(theta){
    >>>> # non expm version JN 110731
    >>>> pred<-Mpred(theta)
    >>>> sigma<-exp(theta[4])
    >>>> -sum(dnorm(dat[,2],mean=pred,sd=sigma, log=TRUE))
    >>>> }
    >>>> 
    >>>> theta<-rep(-2,4)
    >>>> fand<-nlogL(theta)
    >>>> fsim<-negll(theta)
    >>>> cat("Check fn vals: expm =",fand,"   simple=",fsim,"  diff=",fand-fsim,"\n")
    >>>> 
    >>>> cat("time the function in expm form\n")
    >>>> tnlogL<-microbenchmark(nlogL(theta), times=100L)
    >>>> tnlogL
    >>>> 
    >>>> cat("time the function in simpler form\n")
    >>>> tnegll<-microbenchmark(negll(theta), times=100L)
    >>>> tnegll
    >>>> 
    >>>> ftimes<-data.frame(texpm=tnlogL$time, tsimp=tnegll$time)
    >>>> # ftimes
    >>>> 
    >>>> 
    >>>> boxplot(log(ftimes))
    >>>> title("Log times in nanoseconds for matrix exponential and simple MIN fn")
    >>>> 
    >>> 
    >>> --
    >>> Charles C. Berry                         cberry at tajo.ucsd.edu
    >>> 
    >>> ______________________________________________
    >>> R-devel at r-project.org mailing list
    >>> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 
    >> --
    >> Peter Dalgaard, Professor,
    >> Center for Statistics, Copenhagen Business School
    >> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    >> Phone: (+45)38153501
    >> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
    >> "Døden skal tape!" --- Nordahl Grieg