[R] strange timings in convolve(x,y,type="open")

Wed Dec 19 08:13:33 CET 2007

Thanks Charles.  That must be it.  (Berwin also
noticed this.)

When convolve hit the wall, I switched over
to FFTW in C.  That is actually pretty nice
code which runs in n log(n) even for prime n
and takes special account of factors of n up
to about 19 or so.  So if the R team ever wants
to put in a new FFT, that looks like a good one.

But I think easier fix for me, or others in this boat,
would just be to write a new convolve(x,y) that
pads x and y with zeros out to length
2*max( length(x), length(y) ).  Then if x and y
have very composite lengths, especially powers of
2, the fft should go quickly.

-Art

Quoting "Charles C. Berry" <cberry at tajo.ucsd.edu>:

> On Tue, 18 Dec 2007, Art Owen wrote:
>
>> Dear R-ophiles,
>>
>> I've found something very odd when I apply convolve
>> to ever larger vectors.  Here is an example below
>> with vectors ranging from 2^11 to 2^17.   There is
>> a funny bump up at 2^12.  Then it gets very slow at 2^16.
>
> The time is consumed by fft, which is called with vectors of length
> 2*2^i-1 when type = 'o'
>
>> system.time( fft(seq_len(2^13-1)) )
>    user  system elapsed
>    0.31    0.00    0.32
>> system.time( fft(seq_len(2^14-1)) )
>    user  system elapsed
>       0       0       0
>>
>
>
> There are no factors of 2^13-1 or 2^17-1 or 2^18-1
>
>> for (i in 11:20 ) print( c(index=i, nfact=length(which( 0 ==
> (2^(i+1)-1)%%(2:trunc(sqrt(2^(i+1)-1)) )))))
> index nfact
>    11    11
> index nfact
>    12     0
> index nfact
>    13     3
> index nfact
>    14     3
> index nfact
>    15     7
> index nfact
>    16     0
> index nfact
>    17    15
> index nfact
>    18     0
> index nfact
>    19    23
> index nfact
>    20     5
>>
>
> It looks like the code in fft.c tries to find factors of the series
> length and works from there.
>
> So, the size of the problem evidently depends on its factors.
>
> HTH,
>
> Chuck
>
>
>>
>>
>>>  for( i in 11:20 )print( system.time(convolve(1:2^i,1:2^i,type="o")))
>>  user  system elapsed
>> 0.002   0.000   0.002
>>  user  system elapsed
>> 0.373   0.002   0.375
>>  user  system elapsed
>> 0.014   0.001   0.016
>>  user  system elapsed
>> 0.031   0.002   0.034
>>  user  system elapsed
>> 0.126   0.004   0.130
>>  user  system elapsed
>> 194.095   0.013 194.185
>>  user  system elapsed
>> 0.345   0.011   0.356
>>
>> This example is run on a fedora machine with 64 bits.  I hit the same
>> wall at 2^16 on a Macbook (Intel processor I think).  The fedora machine
>> is running R 2.5.0.  That's a bit old (April 07) but I saw no mention of
>> this speed
>> problem in some web searching, and it's not mentioned in the 2.6
>> what's new notes.
>>
>> I've rerun it and found the same bump at 12 and wall at 16.
>> The timing at 2^16  can change appreciably.  In one
>> other case it was about 270 user, 271 elapsed.
>> The 2^18 case ran for hours without ever finishing.
>>
>> At first I thought that this was a memory latency issue.  Maybe it
>> is.  But that makes it hard to explain why 2^17 works better than
>> 2^16.  I've seen that three times now, so I'm almost ready to call it
>> reproducible.
>> Also, one of the machines I'm using has lots of memory.  Maybe it's
>> a cache issue ... but that still does not explain why 2^12 is slower
>> than 2^13 or 2^16 is slower than 2^17.
>>
>> I've checked by running convolve without type="o" and I don't
>> see the wall.  Similarly fft does not have that problem.
>>
>> Here's an example without type="open"
>>> for( k in 11:20)print(system.time( convolve( 1:2^k,1:2^k)))
>>  user  system elapsed
>> 0.001   0.000   0.000
>>  user  system elapsed
>> 0.001   0.000   0.001
>>  user  system elapsed
>> 0.002   0.000   0.002
>>  user  system elapsed
>> 0.004   0.000   0.004
>>  user  system elapsed
>> 0.009   0.001   0.010
>>  user  system elapsed
>> 0.017   0.001   0.018
>>  user  system elapsed
>> 0.138   0.005   0.143
>>  user  system elapsed
>> 0.368   0.012   0.389
>>  user  system elapsed
>> 1.010   0.032   1.051
>>  user  system elapsed
>> 1.945   0.069   2.015
>>
>> This is more what I expected.  Something like N or N log(N) , with
>> the difference hard to discern in granularity and noise.
>>
>> The convolve function is not very big (see below).  When type is
>> not specified, it defaults to "circular".  So my guess is that something
>> mysterious might be happening inside the first else clause below,
>> at least on some architectures.
>>
>> -Art Owen
>>
>>
>>
>>> convolve
>> function (x, y, conj = TRUE, type = c("circular", "open", "filter"))
>> {
>>   type <- match.arg(type)
>>   n <- length(x)
>>   ny <- length(y)
>>   Real <- is.numeric(x) && is.numeric(y)
>>   if (type == "circular") {
>>       if (ny != n)
>>           stop("length mismatch in convolution")
>>   }
>>   else {
>>       n1 <- ny - 1
>>       x <- c(rep.int(0, n1), x)
>>       n <- length(y <- c(y, rep.int(0, n - 1)))
>>   }
>>   x <- fft(fft(x) * (if (conj)
>>       Conj(fft(y))
>>   else fft(y)), inv = TRUE)
>>   if (type == "filter")
>>       (if (Real)
>>           Re(x)
>>       else x)[-c(1:n1, (n - n1 + 1):n)]/n
>>   else (if (Real)
>>       Re(x)
>>   else x)/n
>> }
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> Charles C. Berry                            (858) 534-2098
>                                             Dept of   
> Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu	            UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901