[Rd] convolve: request for "usual" behaviour + some improvements + some fixes

Katharine Mullen kate at few.vu.nl
Tue Feb 6 11:51:34 CET 2007


To add to the wish-list for "convolve":

For modeling processes that decay exponentially in time, e.g.,
fluorescence, it is desirable to have a function that convolves an
arbitrary vector with an exponential using an iterative method.

In the TIMP package (which won't be on CRAN till R 2.5.0 is official, but
is for now at www.nat.vu.nl/~kate/TIMP) we implemented this
special-purpose function as "Conv1" in C to be called via .C.  It would be
nice, at least for us, to have this function (or an improved version) as
an option in "convolve".

There is more on this in Section 4.2 of
 Laptenok S, Mullen KM, Borst JW, van Stokkum IHM, Apanasovich
 VV, Visser AJWG (2007). ``Fluorescence Lifetime Imaging Microscopy (FLIM)
 Data Analysis with TIMP.'' Journal of Statistical Software, 18(8).  URL
 http://www.jstatsoft.org/v18/i08/.
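
For illustration, here is a minimal sketch of such an iterative scheme on
an equidistant time grid (this is not TIMP's Conv1; the name convExp and
its arguments are only illustrative):

convExp <- function(x, k, dt) {
    ## approximates integral_0^t x(s) * exp(-k * (t - s)) ds at each grid point
    decay <- exp(-k * dt)          # attenuation over one time step
    z <- numeric(length(x))
    z[1] <- x[1]
    for (i in seq_along(x)[-1])
        z[i] <- z[i - 1] * decay + x[i]
    z * dt                         # scale by the grid spacing
}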

> ------------------------------
>
> Message: 8
> Date: Sun,  4 Feb 2007 17:51:22 -0800
> From: hpages at fhcrc.org
> Subject: Re: [Rd] convolve: request for "usual" behaviour + some
> 	improvements + some fixes
> To: Martin Maechler <maechler at stat.math.ethz.ch>
> Cc: r-devel at r-project.org
> Message-ID: <1170640282.45c68d9aa3a30 at webmail.fhcrc.org>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi Martin,
>
> Thanks for taking the time to read me. Here is a followup,
> it's rather long but hopefully not too long (and not too boring) ;-)
>
> Quoting Martin Maechler <maechler at stat.math.ethz.ch>:
>
> > Thank you, Herve,
> >
> > >>>>> "Herve" == Herve Pages <hpages at fhcrc.org>
> > >>>>>     on Fri, 02 Feb 2007 21:30:04 -0800 writes:
> >
> >     Herve> Last but not least: convolve2 can be made 100 times or 1000
> >     Herve> times faster than convolve by choosing a power of 2 for the
> >     Herve> length of the fft-buffer (a length of 2^n is the best case
> >     Herve> for the fft, the worst case being when the length is a prime
> >     Herve> number):
> ...
> > The typical approach here - and definitely the idea of the
> > original author of convolve() - would be to use  nextn()  here
> > instead of  "next_power_of_2()".
>
> The current implementation of convolve uses an fft-buffer of length
> nx + ny - 1 for "open" convolution, not nextn().
> The fft-based convolution is by nature "circular". However, it can be
> used for "open" convolutions: the trick is to pad the input sequences
> with enough zeros to avoid the overlap inherent to circular convolution.
> Then the output sequence doesn't "look" circular anymore.
>
> For example, if the input sequences are
>    X =   2   5   8
>    Y = 100   1   0
> then the "circular" convolution of X and Y is
>    Z = 208 502 805
> To achieve "open" convolution, it's enough to _right_ pad X and
> Y with zeros:
>    X =   2   5   8   0
>    Y = 100   1   0   0
> then the "circular" convolution doesn't look circular anymore (it
> looks "open"):
>    Z = 200 502 805   8
>
> All you need to do is pad with enough zeros so that the lengths of X
> and Y are >= nx + ny - 1. But you can add more zeros if you want: this
> doesn't change the result (except that you get extra zeros in the output
> sequence):
>   X =   2   5   8   0   0   0
>   Y = 100   1   0   0   0   0
>   Z = 200 502 805   8   0   0
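>
> For illustration, here is a minimal sketch of this padding trick in R
> (the name open_conv is only illustrative; Re() is taken because the
> inputs here are real):
>
>   open_conv <- function(x, y) {
>       nz <- length(x) + length(y) - 1
>       x <- c(x, rep(0, nz - length(x)))   # right-pad both inputs ...
>       y <- c(y, rep(0, nz - length(y)))   # ... to length nx + ny - 1
>       Re(fft(fft(x) * fft(y), inverse = TRUE)) / nz
>   }
>   open_conv(c(2, 5, 8), c(100, 1))        # 200 502 805 8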
>
> The current approach of adding the strict minimum number of necessary
> zeros could _at first glance_ seem reasonable since it minimizes
> the amount of memory used. But the problem with this approach is that the
> fft algorithm is not very good in the general case and very, very bad when
> the length of the sequence is a prime number: in this case it has to
> perform n^2 complex multiplications, so it can literally take days or
> weeks when n is a big prime number (>= 10 million).
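>
> A back-of-the-envelope comparison (assuming n is a prime around 1e7):
>
>   n <- 1e7
>   n^2          # ~1e14 complex multiplications for the O(n^2) prime-length case
>   n * log2(n)  # ~2.3e8 for an fft on a padded power-of-2 length of similar size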
>
> >
> > convolve() is one of the very old R functions stemming from
> > Auckland, already seen in the very first R tarball that's still
> > available: dated from June 20, 1995, with most file dates from
> > Jun 16/17, i.e. really of even older date, the file src/rrc/fft
> > (no 'library', no .R extension yet) contains definitions for  'fft', 'nextn'
> > and 'convolve', where the latter was (note the ":=" for what now
> > would be assignment to the base package)
> >
> > convolve := function (a, b, conj=F)
> > {
> > 	na <- length(a)
> > 	nb <- length(b)
> > 	n <- max(na, nb)
> > 	if (nb < n)
> > 		b <- c(b, rep(0, n - nb))
> > 	else if (na < n)
> > 		a <- c(b, rep(0, n - na))
> > 	da <- fft(a)
> > 	db <- fft(b)
> > 	if(conj) {
> > 		a <- da$x * db$x + da$y * db$y
> > 		b <- - da$x * db$y + db$x * da$y
> > 	}
> > 	else {
> > 		a <- da$x * db$x - da$y * db$y
> > 		b <- da$x * db$y + db$x * da$y
> > 	}
> > 	fft(a, b, inv=T)$x
> > }
> >
> > and just for historical fun here's the help file in that
> > cute olde help file format:
> >
> > TITLE(convolve @ Fast Convolution)
> > USAGE(
> > convolve(a, b, conj=false)
> > )
> > ARGUMENTS(
> > ARG(a,b @ the sequences to be convolved.)
> > ARG(conj @ logical, if true the transform of LANG(b) is conjugated
> > before back-transformation.)
> > )
> > DESCRIPTION(
> > LANG(convolve) uses the Fast Fourier Transform to compute the
> > convolution of the sequences given as its arguments.
> > PARA
> > Complex conjugation is useful when computing
> > autocovariances and autocorrelations by fast convolution.
> > )
> > REFERENCES(
> > Brillinger, D. R. (1981).
> > ITALIC(Time Series: Data Analysis and Theory), Second Edition.
> > San Francisco: Holden-Day.
> > )
> > SEEALSO(
> > LANG(LINK(fft)), LANG(LINK(nextn)).
> > )
>
> If you ignore the huge bug that 'a' is replaced by 'b' during the 0-padding
> operation, this one was better than the current 'convolve'. It already had
> the 'conj' argument and the default for it was FALSE so, by default,
> the convolve function was doing convolution, not cross-correlation
> (more on this below).
> But it was doing "circular" convolution only...
>
> >
> > Later I had added bits to the documentation, convolve got the 'type'
> > argument, and Brian also fixed code, amplified the description and provided
> > the alternative filter(), which has henceforth been recommended
> > instead of convolve for most applications.
>
> filter() might be OK for users who want filtering but what about people
> who want a convolution?
>
> Just FYI, with:
>    X =   2   5   8
>    Y = 100   1
> filter(X, Y) gives:
>    Z = 502 805  NA
> convolve(X, Y, type="filter") gives:
>    Z = 208 805
> and convolve(x, y, conj=FALSE, type="filter") gives:
>    Z = 200 502
> so there are some inconsistencies.
>
> But more important: what if I want the "usual" convolution product?
>
> convolve(X, Y, type="o") gives:
>    Z = 2 208 805 500
> and convolve(x, y, conj=FALSE, type="o") is closer to what I expect
>    Z = 5 200 802 508
> but still not there :-/
>
> IMO, having a function that does the correct convolution product is
> very useful, e.g. for polynomial multiplication (if you see X and Y as
> the coefficients of polynomials Px and Py, then Z contains the coefficients
> of the polynomial product Px * Py), or for fast high-precision arithmetic
> (_exact_ multiplication of numbers with thousands of decimals).
> An fft-based convolve function is _the_ tool for those operations
> and, of course, it should be fast.
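>
> E.g., reading X and Y as polynomial coefficients (constant term first),
> (2 + 5x + 8x^2) * (100 + x) with the open_conv sketch from above:
>
>   open_conv(c(2, 5, 8), c(100, 1))   # 200 502 805 8, i.e. 200 + 502x + 805x^2 + 8x^3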
>
> >
> > For back-compatibility (and reverence to the very first R code
> > writers (!?)) we did not change its behavior but rather documented it
> > more precisely.
> >
> > I haven't studied the details of all the possible padding
> > options for a long time, but I remember that 0-padding
> > (to length 'n' where 'n' is "highly composite") is only
> > approximately the same as a non-padded version -- since the
> > Fourier frequencies  2*pi*j/n depend on n.
> > As you notice, padding is often very recommendable but it's
> > strictly giving the solution to a different problem.
>
> Nope, it gives exactly the same result (modulo floating point
> rounding errors).
>
> > For that reason, ?convolve has mentioned  nextn() for 12 years
> > now, but not "urged" the padding
>
> convolve() mentions nextn() but does not use it. The user has
> no control over the length of the fft-buffer that is used.
> But the user shouldn't need to: you can safely assume that the best
> choice is almost always to take the next_power_of_2 length.
> Maybe there could be a few situations where you don't want to do this.
> For example, if length(X) + length(Y) - 1 is 1031, the next power
> of 2 is 2048 so you need to (almost) double the size of X and Y.
> Maybe you are on a machine with very limited memory and you don't
> want to do this. But in this case you'd better not try to use the
> fft-based convolution at all because when n is a prime number (1031),
> then it's better to compute the convolution by just doing:
>
>    Z[i] = sum(k, X[k]*Y[i-k])
>
> You'll do far fewer multiplications! And they will be on
> doubles (or integers) when X and Y are double (or integer) vectors
> (with the fft-based convolution, all multiplications are on complex numbers).
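>
> As a minimal sketch of that direct summation (the name direct_conv is
> only illustrative, and no fft is involved):
>
>   direct_conv <- function(x, y) {
>       z <- numeric(length(x) + length(y) - 1)
>       for (j in seq_along(y)) {           # accumulate x shifted by each lag of y
>           idx <- j:(j + length(x) - 1)
>           z[idx] <- z[idx] + x * y[j]
>       }
>       z
>   }
>   direct_conv(c(2, 5, 8), c(100, 1))      # 200 502 805 8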
>
> All this to say that it's only worth using an fft-based convolution
> if the length of the fft buffer is a power of 2. And for open
> convolution you can _always_ do this. And the cost of it is nothing
> compared to the gain in speed (use twice the memory to multiply the
> speed by 100, 10000 or more!)
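>
> For the 1031 example above, these are the buffer lengths the two padding
> strategies would pick (nextn() defaults to the factors 2, 3 and 5):
>
>   nextn(1031)             # 1080
>   2^ceiling(log2(1031))   # 2048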
>
> For example, compare the speed of 'filter' (not fft-based) with the
> speed of 'convolve2':
>
>   > x <- sample(1:9, 100000, replace=TRUE)
>   > y <- rep(1:1, 10000)
>   > system.time(z1 <- filter(x, y))
>      user  system elapsed
>    31.486   0.460  32.478
>   > system.time(z2 <- convolve2(x, y, type="o"))
>      user  system elapsed
>     0.400   0.032   0.454
>
> Note that z1 starts and ends with 4999 NAs. All the values in between
> (the filtered signal?) are also in z2. To be precise:
>
>   all.equal(z1[5000:95000], z2[10000:100000])
>
> is TRUE.
>
>
> >
> > If we changed convolve() to behave more like your proposal,
> > how many user-defined functions and scripts would give wrong
> > answers if they were not amended?
>
> With the current version of convolve, they are getting wrong answers
> anyway. Nobody on this list wants this
>
>   http://www.sciencemag.org/cgi/content/summary/314/5807/1856
>
> to happen just because of a buggy convolve function in R, right?
>
> > Probably only a very small fraction of all existing code (since
> > convolve() is not in wide use), but that may still be hundreds of
> > cases. So that would need the new version to have a new name
> > (such as "convolve2" or maybe, slightly nicer, "convolve.").
> >
> > Is it worth introducing that and starting to deprecate convolve()?
> > IIRC, the last time we considered this (several years ago), we
> > had concluded "no", but maybe it's worth getting rid of this
> > infelicity rather than documenting it in ever more detail.
> >
>
> Below is another 'convolve2' function (just in case). I've
> reintroduced the 'conj' argument after I did some homework
> and understood what it was really doing. _What_ it
> does is make 'convolve' return the cross-correlation of
> X and Y instead of the convolution.
> _How_ it does this is by taking the conjugate of fft(y)
> before multiplying it by fft(x). Without going into too many
> details, I'll just say that this "trick" doesn't play well
> with the type="o" option, hence the special treatment of
> y when 'type' is not "circular" and 'conj' is TRUE.
> Also, I've reset the default to FALSE so that, by default, convolve2
> does convolution, not cross-correlation.
>
> As for the "filter" type, it seems that the idea of its original
> author was to truncate the result of the convolution but it's
> not clear to me where this truncation should occur (to the right?
> to the left?). IMO, the convolution product of X by Y _is_ the
> result of X filtered by Y, it's just that one terminology comes
> from maths and the other from signal processing. And there is nothing
> to truncate because if the filter (Y) has a length >= 2, then it's
> normal to expect a filtered signal longer than X so I would tend to
> say that the "filter" feature is a broken concept.
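>
> For what it's worth, with the toy sequences from above, filter()'s output
> is just a window of the full ("open") convolution product, with NAs where
> the filter runs off the ends of X (using the open_conv sketch from above):
>
>   filter(c(2, 5, 8), c(100, 1))      #     502 805  NA
>   open_conv(c(2, 5, 8), c(100, 1))   # 200 502 805   8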
>
> Best,
> H.
>
>
> convolve2 <- function(x, y, conj=FALSE, type=c("circular", "open", "filter"))
> {
>     type <- match.arg(type)
>     nx <- length(x)
>     ny <- length(y)
>     if (type == "circular") {
>         fft_length <- max(nx, ny)
>     } else {
>         if (conj) {
>             y <- rev(y)
>             if (is.complex(y))
>                 y <- Conj(y)
>             conj <- FALSE
>         }
>         nz <- nx + ny - 1
>         fft_length <- 2^ceiling(log2(nz))
>     }
>     if (fft_length > nx)
>         x[(nx+1):fft_length] <- as.integer(0)
>     if (fft_length > ny)
>         y[(ny+1):fft_length] <- as.integer(0)
>     fy <- fft(y)
>     if (conj)
>         fy <- Conj(fy)
>     z <- fft(fft(x) * fy, inverse=TRUE) / length(x)
>     if (type == "open") {
>         z <- z[1:nz]
>     } else {
>         if (type == "filter")
>             z <- z[1:nx]
>     }
>     if (is.numeric(x) && is.numeric(y))
>         z <- Re(z)
>     if (is.integer(x) && is.integer(y))
>         z <- as.integer(round(z))
>     z
> }
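>
> A quick check with the toy sequences from the discussion above:
>
>   convolve2(c(2, 5, 8), c(100, 1, 0), type="circular")  # 208 502 805
>   convolve2(c(2, 5, 8), c(100, 1), type="open")         # 200 502 805 8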
>
>

----
Katharine Mullen
Department of Physics and Astronomy
Faculty of Sciences
Vrije Universiteit Amsterdam
de Boelelaan 1081
1081 HV Amsterdam
The Netherlands
room: T.1.06
tel: +31 205987870
fax: +31 205987992
e-mail: kate at nat.vu.nl
http://www.nat.vu.nl/~kate/


