[R-SIG-Finance] efficient extraction of local extrema and zero-crossings in large multivariate zoo?
Joshua Ulrich
josh.m.ulrich at gmail.com
Thu Jun 25 22:33:59 CEST 2009
Christian,
I hope this is one step closer to what you need. I put it together
pretty quickly, so there may be (quite a few?) bugs. Proceed with
caution. apply.daily() is in the xts package.
find.extrema2 = function(myzoo) {
days = as.POSIXct(levels(as.factor(as.Date(index(myzoo)))))
ret = myzoo
ret[] <- NA
for (tcol in 1:dim(myzoo)[2]) {
max.days <- apply.daily(myzoo[,tcol],function(x) { index(x)[which.max(x)] } )
max.days <- myzoo[as.POSIXct(coredata(max.days),origin='1970-01-01'),tcol]
min.days <- apply.daily(myzoo[,tcol],function(x) { x[x<0] <- 0;
index(x)[which.min(x)] } )
min.days <- myzoo[as.POSIXct(coredata(min.days),origin='1970-01-01'),tcol]
ret[,tcol] <- cbind(ret[,tcol],rbind(min.days,max.days))[,2]
} # end tcol
ret = na.approx(ret, na.rm=FALSE)
return(ret)
}
Best,
Josh
--
http://www.fosstrading.com
On Thu, Jun 25, 2009 at 5:37 AM, Christian Gunning<icos.atropa at gmail.com> wrote:
> I have an multi-year hour-sampled multivariate zoo from which i'm
> trying to extract index() and coredata() of daily max and
> zero-crossing for each variable and use na.approx() to fill in
> intervening values, so that the resulting zoo has the same dimensions
> of the input zoo. Currently i'm looping over each column and day,
> extracting a day's worth of data with window(), and using which.max()
> to extract the "record". For short zoos it works fine. It doesn't
> scale well, though - for example, 2e4 points takes ~4-6 times longer
> than 1e4 points with 2 variables. Am i missing major bottlenecks or
> vectorization potentials here?
>
> find.extrema = function(myzoo) {
> days = as.POSIXct(levels(as.factor(as.Date(index(myzoo)))))
> ret = myzoo
> ret[] <- NA
> for (tcol in 1:dim(myzoo)[2]) {
> for (day in days) {
> this <- window(myzoo[,tcol], start=day, end=day+24*60*60-1)
> thismax <- this[which.max(this)]
> this[this<0] <- 0 ## remove negative values first
> thiszero <- this[which.min(this)]
> ret[index(ret) == index(thismax), tcol] <- coredata(thismax)
> ret[index(ret) == index(thiszero), tcol] <- coredata(thiszero)
> ### ret[index(thismax), tcol] <- coredata(thismax) ### gives an error
> } # end days
> } # end tcol
> ret = na.approx(ret)
> return(ret)
> }
>
> hours=1e4 # about 2 year's worth of hours
> tmp=(as.POSIXct('2001-01-01')+1:hours*60*60)
> tmpz=zoo(cbind(a=sin(as.integer(tmp)/1e4), b=sin(as.integer(tmp)/1.1e4)), tmp)
> system.time(tmpextrema <- find.extrema(tmpz))
>
> thanks,
> christian gunning
> university of new mexico
>
> _______________________________________________
> R-SIG-Finance at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> -- Subscriber-posting only.
> -- If you want to post, subscribe first.
>
More information about the R-SIG-Finance
mailing list