[R] Fast linear convolution with R

Martin Maechler maechler at stat.math.ethz.ch
Thu Feb 27 18:42:14 CET 2014


>>>>> "MM" == Mitchell Maltenfort <mmalten at gmail.com>
>>>>>     on Thu, 27 Feb 2014 10:34:00 -0500 writes:

    MM> "Convolve" uses the FFT so probably expects powers of 2.

Not quite (but in the correct direction):

?fft  contains

     The FFT is fastest when the length of the series being transformed
     is highly composite (i.e., has many factors).  If this is not the
     case, the transform may take a long time to compute and will use a
     large amount of memory.


Additionally,

Only the default 'type' ("circular")  directly calls fft() on
the original inputs.

Your type = "open" (or type = "filter")  0-pads the inputs
x and y  from left and right :

        x <- c(rep.int(0, length(y)-1),  x)
        y <- c(y, rep.int(0, length(x) - 1)))
  
and in this case,  if the original lengths are equal, = n,
the length of the padded vectors are newn := n + n-1 = 2n-1.

Here, n = 1e7, so newn = 1999999
and that is a prime number -- i.e., the absolute worst case !!!

BTW, for  n = 1e5,  100 x smaller,  2n-1 = 199999
is also a prime number, and indeed convolve() already takes over
40 seconds on my new fast computer.

As a matter, it seem that taking   n = 10^k  for convolve,
type = "open" or "filter" is very often bad :
as 19..9 seems relatively often to be prime, and still has few
factors in the other cases :

> sfsmisc::factorize(2 * 10^(2:9) - 1)
$`199`
       p m
[1,] 199 1

$`1999`
        p m
[1,] 1999 1

$`19999`
        p m
[1,]    7 1
[2,] 2857 1

$`199999`
          p m
[1,] 199999 1

$`1999999`
        p m
[1,]   17 1
[2,]   71 1
[3,] 1657 1

$`19999999`
         p m
[1,] 2e+07 1

$`199999999`
        p m
[1,]   89 1
[2,] 1447 1
[3,] 1553 1

$`1999999999`
            p m
[1,]       31 1
[2,] 64516129 1

> 

If one does 0 padding,  one should really pad with one more, to
get length 2n instead of 2n-1:
2n is never (well, almost :-) a prime,
and n will often be composite, so  2n  will have the nice
properties that fft() wants.

Maybe allow types "1+open", "1+filter" which will pad with one
more, and hence will typically be *fast*  (and very slightly
different) ? 

Martin



    MM> You might want to look at using "filter"


    MM> On Thu, Feb 27, 2014 at 5:31 AM, Philip Wette <wette at mail.upb.de> wrote:
    >> Hello,
    >> 
    >> i am trying to compute the linear convolution of two vectors of length 1e7
    >> each.
    >> The code i am using is
    >> a = vector(length=1e7)
    >> b = vector(length=1e7)
    >> #fill a and b with data...
    >> c = convolve(a, rev(b), type="o")
    >> 
    >> Unfortunately, this computation goes on now for a very long time (currently
    >> 15h and counting).
    >> 
    >> 
    >> Does it make sense to wait a couple of hours more or do i only waste my time
    >> and resources because it will take ages?
    >> Is there maybe a better way to compute the convolution?
    >> 
    >> Or are there specific vector lengths that speed up the computation of the
    >> convolution? I for example found out that convolving vectors of length 1e5
    >> takes 3 times longer than convolving vectors of size 4e6...
    >> 
    >>> b = vector(length=4e6)
    >>> a = vector(length=4e6)
    >>> system.time( convolve(a, b, type="o") )
    >> user  system elapsed
    >> 123.796   0.196 124.132
    >>> a = vector(length=1e5)
    >>> b = vector(length=1e5)
    >>> system.time( convolve(a, b, type="o") )
    >> user  system elapsed
    >> 303.129   0.099 303.635
    >> 
    >> 
    >> 
    >> Best,
    >> 
    >> Philip
    >> 
    >> 
    >> --
    >> Philip Wette, M.Sc.             E-Mail: wette at mail.upb.de
    >> University of Paderborn         Tel.:   05251 / 60-1716
    >> Department of Computer Science
    >> Computer Networks Group         http://wwwcs.upb.de/cs/ag-karl
    >> Warburger Straße 100            Room:   O3.152
    >> 33098 Paderborn
    >> 
    >> ______________________________________________
    >> R-help at r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-help
    >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    >> and provide commented, minimal, self-contained, reproducible code.

    MM> ______________________________________________
    MM> R-help at r-project.org mailing list
    MM> https://stat.ethz.ch/mailman/listinfo/r-help
    MM> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    MM> and provide commented, minimal, self-contained, reproducible code.




More information about the R-help mailing list