[R] zoo performance regression noticed (1.6-5 is faster...)

Gabor Grothendieck ggrothendieck at gmail.com
Fri Nov 4 18:02:24 CET 2011


On Fri, Nov 4, 2011 at 12:56 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> On Fri, Nov 4, 2011 at 12:34 PM, James Marca
> <jmarca at translab.its.uci.edu> wrote:
>> Good morning,
>>
>> I have discovered what I believe to be a performance regression
>> between zoo 1.6-x and zoo 1.7-6 in the application of rollapply.
>> On zoo 1.6-x, rollapply of my function over my data takes about 20
>> minutes. Using 1.7-6, the same code takes about 6 hours.
>>
>> R --version
>> R version 2.13.1 (2011-07-08)
>> Copyright (C) 2011 The R Foundation for Statistical Computing
>> ISBN 3-900051-07-0
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> Two versions of zoo 1.6 run *fast*. On one machine I am running
>>
>>  less /usr/lib64/R/library/zoo/DESCRIPTION
>>  Package: zoo
>>  Version: 1.6-3
>>  Date: 2010-04-23
>>  Title: Z's ordered observations
>>  ...
>>  Packaged: 2010-04-23 07:28:47 UTC; zeileis
>>  Repository: CRAN
>>  Date/Publication: 2010-04-23 07:43:54
>>  Built: R 2.10.1; ; 2010-04-25 06:41:34 UTC; unix
>>
>> (Thankfully I forgot to run update.packages() on this machine!)
>>
>> On the other
>>
>>  Package: zoo
>>  Version: 1.6-5
>>  Date: 2011-04-08
>>  ...
>>  Packaged: 2011-04-08 17:13:47 UTC; zeileis
>>  Repository: CRAN
>>  Date/Publication: 2011-04-08 17:27:47
>>  Built: R 2.13.1; ; 2011-11-04 15:49:54 UTC; unix
>>
>> I have stripped out zoo 1.7-6 from all my machines.
>>
>> I tried to ensure all libraries were identical on the two machines
>> (using lsof), and after finally downgrading zoo I got the second
>> machine to be as fast as the first, so I am quite certain the
>> difference in speed is down to the zoo version used.
>>
>> My code runs a fairly simple function over a time series using the
>> following call to process a year of 30s data (9 columns, about a
>> million rows):
>>
>>    vals <- rollapply(data=ts.data[,c(n.3.cols, o.3.cols,volocc.cols)]
>>                  ,width=40
>>                  ,FUN=rolling.function.fn(n.cols=n.3.cols,o.cols=o.3.cols,vo.cols=volocc.cols)
>>                  ,by.column=FALSE
>>                  ,align='right')
>>
>>
>> (The rolling.function.fn call returns a function, a closure that is
>> initialized with the arguments given in the call above. It is a trick
>> I learned from JavaScript.)
>>
>> If this is a known situation with the new 1.7 generation of zoo, my
>> apologies and I'll go away.  If my code could be turned into a useful
>> test, I'd be happy to help out as much as I'm able.  Given the extreme
>> runtime difference though, I thought I should offer my help in this
>> case, since zoo is such a useful package in my work.
>
> This was a known problem and was fixed, but if it's still there then
> there must be some other condition under which it can occur as well.
> If you can provide a small, self-contained, reproducible example it
> would help in tracking it down.
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>

Also, as a workaround you can try this to use an old rollapply in a
new version of zoo:

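# load the installed (new) zoo first
library(zoo)
# then mask its rollapply with the older revision 817 from R-Forge
source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/rollapply.R?revision=817&root=zoo")
# and call rollapply with your own arguments as before
rollapply(...whatever...)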
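
For the reproducible example requested above, something along these lines might be a starting point. It is only a sketch: the synthetic data, the 2000-row size, and the make.window.fn helper (a hypothetical stand-in for rolling.function.fn, whose body was not posted) are all assumptions. Running it unchanged under each zoo version and comparing the elapsed times should show whether the slowdown appears with by.column=FALSE:

```r
library(zoo)

set.seed(1)
n <- 2000  # assumption: far smaller than the ~1e6 rows in the report
z <- zoo(matrix(runif(n * 9), ncol = 9), order.by = seq_len(n))

# Hypothetical stand-in for rolling.function.fn: a factory that returns
# a closure remembering which columns to use.
make.window.fn <- function(cols) {
  function(w) {
    # coredata() accepts both a zoo window and a plain matrix window
    sum(coredata(w)[, cols])
  }
}

# Same shape of call as in the report: all columns per window,
# width 40, right-aligned.
elapsed <- system.time(
  vals <- rollapply(z, width = 40, FUN = make.window.fn(1:3),
                    by.column = FALSE, align = "right")
)

print(packageVersion("zoo"))
print(elapsed)
```

If the timings differ sharply between versions, increasing n should make the gap easier to see.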

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
