[R-SIG-Finance] troubles with apply.daily

Ted Byers r.ted.byers at gmail.com
Mon Jan 30 06:13:52 CET 2012


Hi Michael,

Thanks

With:

> x <- xts(1:500, Sys.Date() + 1:500)
> 
> apply.weekly(x, max)
> apply.weekly(x, str)

apply.weekly(x,max) works fine.

apply.weekly(x,str) generates the same error that I was seeing with my data.
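
A minimal base-R sketch of why a function like str() trips this up,
assuming (as Michael suggests in the quoted reply below) that the failure
comes from assembling the per-period return values. The split() chunks only
stand in for weekly blocks and are not what apply.weekly does internally;
show_and_count is a hypothetical helper, not part of xts.

```r
# Two stand-in "weeks" of data; split() here only mimics per-period chunks.
chunks <- split(1:14, rep(1:2, each = 7))

# str() prints its summary as a side effect and returns NULL invisibly,
# so collecting its results yields a list of NULLs, not numbers.
out_str <- sapply(chunks, str)

# max() returns one number per chunk, which simplifies to a plain vector.
out_max <- sapply(chunks, max)

# Hypothetical helper: inspect a chunk but still return a real value.
show_and_count <- function(d) {
  str(d)
  length(d)
}
out_count <- sapply(chunks, show_and_count)
```

If printing is all that is wanted, returning some real value after the
str() call (as show_and_count does) should leave something numeric to
assemble into the result.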

Alas, that means the error I am getting with rollapply is due to a different
cause.  :-(

Here are the functions involved in rollapply:

rootFun <- function(pr) {
  rv = 100000000000000000000
  n = length(pr)
  for (i in 1:n) {
    a = Re(pr[i])
    b = abs(Im(pr[i]))
    if (b < 0.0000001) {
      if (abs(a) < 0.25) {
        rv = a
      } else {
        if ((a < 0) && (a > rv)) {
          rv = a
        } else {
          if (rv > a) {
            rv = a
          }
        }
      }
    }
  }
  rv
}
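
One possible trigger for "missing value where TRUE/FALSE needed" in a loop
like the one in rootFun, offered as a guess rather than a diagnosis of the
actual data: if pr were ever empty, 1:n with n = 0 expands to c(1, 0), so
the loop body still runs and pr[1] is NA. A completely flat window of
prices might, I believe, lead polyroot() to return an empty vector, but
that has not been checked against the real file. The empty pr below is an
assumption made purely for illustration.

```r
# Assumption for illustration: the root vector is empty.
pr <- complex(0)

idx_colon <- 1:length(pr)   # counts down: 1, 0
idx_safe  <- seq_along(pr)  # empty, so a for loop over it never runs

# Indexing past the end of an empty vector gives NA, and if (NA) is an error.
msg <- tryCatch(
  if (abs(Im(pr[1])) < 1e-7) "real root" else "complex root",
  error = function(e) conditionMessage(e)
)
print(msg)

# polyroot() drops trailing zero coefficients, so an all-zero
# polynomial should produce no roots at all.
print(length(polyroot(c(0, 0, 0, 0))))
```

If this does turn out to be the cause, looping over seq_along(pr) instead
of 1:n would skip the loop for an empty vector.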

rollRegFun <- function(d) {
  cmultiplier <- 5
  d <- data.frame(d, rt=seq(1-nrow(d),0));
  polyfit <- lm(Close ~ poly(rt,4),d)
  z <- summary(polyfit)
  r2 <- z$r.squared
  confint(polyfit)
  p <- coef(polyfit)
  pr <- polyroot(c(p[2],2*p[3],3*p[4],4*p[5]))
  dd <- function(x) {  rv = 2*p[3]+6*p[4]*x+12*p[5]*x*x;rv;}
  r <- as.double(rootFun(pr))
  acc <- dd(r)
  sig <- 'U'
  n <- nrow(d)
  sigprice <- d$Close[n]
  ub <- 0
  lb <- 0
  stopprice <- 0
  ci <- confint(polyfit, level=0.999)
  if (abs(r) < 0.25) {
    lb <- ci[1]
    ub <- ci[6]
#    sigprice <- as.double(p[1])
    sigprice <- d$Close[n]
    if (acc > 0) {
      sig <- 'B'
      sigprice <- floor(sigprice/cmultiplier)*cmultiplier
      stopprice <- floor(lb/cmultiplier)*cmultiplier
      if ((sigprice - stopprice) < 15) {
        stopprice = sigprice - 15
      }
    } else {
      sig <- 'S'
      sigprice <- ceiling(sigprice/cmultiplier)*cmultiplier
      stopprice <- ceiling(ub/cmultiplier)*cmultiplier
      if ((stopprice - sigprice) < 15) {
        stopprice = sigprice + 15
      }
    }
  }
  rv <- data.frame(r,acc,sig,sigprice,lb,ub,stopprice, r2)
  rv
}

Do these tell you anything about what may be awry with my use of rollapply?
And is there a way to get R to give more detail about what it is that it
doesn't like, and what part of my code is to blame?
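
For what it is worth, base R offers traceback() (run right after the error)
and options(error = recover) to inspect the call stack. Another option, as
a rough sketch, is to wrap the function so that a failure stores the input
that caused it instead of stopping the whole run. capture_failures and toy
below are hypothetical names for illustration; toy merely stands in for a
function such as rollRegFun.

```r
# Hypothetical helper: wrap FUN so that a failure stores the offending input
# and its error message, then returns NA instead of aborting.
capture_failures <- function(FUN) {
  failed <- list()
  wrapped <- function(d, ...) {
    tryCatch(FUN(d, ...), error = function(e) {
      failed[[length(failed) + 1]] <<- list(input = d,
                                            message = conditionMessage(e))
      NA_real_
    })
  }
  list(fun = wrapped, failures = function() failed)
}

# Toy stand-in for the real function: it fails when the window is constant.
toy <- function(d) {
  if (length(unique(d)) == 1) stop("constant window")
  mean(d)
}

cf <- capture_failures(toy)
windows <- list(c(1, 2, 3), c(5, 5, 5), c(2, 4, 6))
results <- sapply(windows, cf$fun)

# Inspect the first window that failed, and why.
print(cf$failures()[[1]])
```

The stored input can then be passed to the original function by hand, with
debug() or browser() switched on, to see exactly which line objects.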

Thanks again

Ted


> -----Original Message-----
> From: R. Michael Weylandt [mailto:michael.weylandt at gmail.com]
> Sent: January-29-12 11:59 PM
> To: Ted Byers
> Cc: r-sig-finance at r-project.org
> Subject: Re: [R-SIG-Finance] troubles with apply.daily
> 
> I think I've got it now: consider the following:
> 
> x <- xts(1:500, Sys.Date() + 1:500)
> 
> apply.weekly(x, max)
> apply.weekly(x, str)
> 
> The error message is a little subtle: the problem isn't in
> apply.weekly() but rather in coredata, which means it's actually in
> the function being used to construct the return value. Specifically,
> your problem comes from the fact that str() is one of those rare
> functions called for its side-effects not its return value. str()
> actually always returns NULL invisibly (the printed stuff is a side
> effect) and it's impossible to have a NULL (or actually lots of them)
> in the values for an xts object. I believe if you run apply.weekly
> with a "real function" like max or Hi() or mean() or whatnot, it
> should work fine. Can you test this?
> 
> I don't know if this explains your problems with rollapply as you
> don't define what your function is in the code you share.
> 
> Does that work?
> 
> Michael
> 
> On Sun, Jan 29, 2012 at 11:50 PM, Ted Byers <r.ted.byers at gmail.com> wrote:
> > Hi Michael,
> >
> > Thanks for your help.
> >
> > Yes, your example works fine for me.
> >
> > Example data is a problem because of the quantity of data needed to
> > produce the problem.  I have, for example, a file of a few hundred
> > kilobytes, for tick data for less than a day, and the problem does
> > not materialize with it.
> > I have another file of many tens of megabytes on which it invariably
> > happens.  I don't know how large a data file is needed to produce the
> > problem.  I really don't think you want me sending such a large
> > dataset...
> >
> > I have been investigating, and have progressed to a state where I am
> > not mixing types of time objects.
> >
> > One of the curiosities is that it does not affect the processing of
> > relatively tiny files; I see it only on larger files (representing
> > 3 months of data).
> >
> > Now the steps I use to prepare my data are:
> >
> > x = read.table("quotes_M11.dat", header = FALSE, sep="\t", skip=0)
> > str(x)
> >
> > dt<-sprintf("%s %04d",x$V2,x$V4)
> > dt<-as.POSIXlt(dt,format="%Y-%m-%d %H%M")
> > dt <- as.POSIXct(dt)
> > y <- data.frame(dt,x$V5)
> > colnames(y) <- c("tickdate","price")
> > z <- xts(y[,2],y[,1])
> > #alpha <- to.minutes3(z, OHLC=TRUE)
> > alpha <- to.minutes(z, OHLC=TRUE, drop.time=FALSE)
> > colnames(alpha) <- c("Open","High","Low","Close")
> > #tseq <- seq(start(alpha),end(alpha), by = 60)
> > tseq <- seq(start(alpha),end(alpha), by = "min")
> > alpha <- na.approx(alpha, xout = tseq)
> > head(alpha)
> > tail(alpha)
> >
> > The latter two calls produce:
> >
> >> head(alpha)
> >                        Open     High      Low   Close
> > 2011-03-10 00:00:00 10350.00 10365.00 10350.00 10360.0
> > 2011-03-10 00:01:00 10353.33 10363.33 10353.33 10360.0
> > 2011-03-10 00:02:00 10356.67 10361.67 10356.67 10360.0
> > 2011-03-10 00:03:00 10360.00 10360.00 10360.00 10360.0
> > 2011-03-10 00:04:00 10360.00 10360.00 10360.00 10360.0
> > 2011-03-10 00:05:00 10361.50 10361.50 10361.50 10361.5
> >> tail(alpha)
> >                    Open High  Low Close
> > 2011-06-08 23:51:00 9430 9430 9430  9430
> > 2011-06-08 23:52:00 9430 9430 9430  9430
> > 2011-06-08 23:53:00 9430 9430 9430  9430
> > 2011-06-08 23:54:00 9430 9430 9430  9430
> > 2011-06-08 23:55:00 9430 9430 9430  9430
> > 2011-06-08 23:56:00 9430 9430 9430  9430
> >
> >
> > As you can see, the data look fine.
> >
> > Alas, it seems neither rollapply nor apply.daily likes it yet.
> >
> >> tr <- rollapply(alpha,width=20,FUN=rollRegFun,by.column=FALSE,
> > align="right")
> > Error in if (b < 1e-07) { : missing value where TRUE/FALSE needed
> >
> > And the following:
> >
> > myfun <- function(d) {
> >  str(d)
> > }
> > apply.daily(alpha,myfun)
> >
> > Produces output right to the last day, but then dies (the output from
> > the last day processed successfully and the error):
> >
> > An ‘xts’ object from 2011-06-08 to 2011-06-08 23:56:00 containing:
> >  Data: num [1:1437, 1:4] 9435 9435 9440 9445 9450 ...
> >  - attr(*, "dimnames")=List of 2
> >  ..$ : NULL
> >  ..$ : chr [1:4] "Open" "High" "Low" "Close"
> >  Indexed by objects of class: [POSIXct,POSIXt] TZ:
> >  xts Attributes:
> >  NULL
> > Error in coredata.xts(x) : currently unsupported data type
> >>
> >
> > As you can see, all of the data from the last day for which there was
> > data in the file was processed (in this case by str()), but then it
> > looks like apply.daily() tries to apply myfun to data for the day
> > after the last day for which there is data, and not surprisingly
> > doesn't find any.  I guess the question becomes: why does apply.daily
> > try to go past the last date in the data?
> >
> > I do hope that the problem I am seeing in rollapply and that I see
> > when using apply.daily are related, as that would mean that I fix one
> > and the other gets fixed.
> >
> > Any other ideas?
> >
> > Thanks
> >
> > Ted
> >
> >> -----Original Message-----
> >> From: R. Michael Weylandt [mailto:michael.weylandt at gmail.com]
> >> Sent: January-29-12 11:02 PM
> >> To: Ted Byers
> >> Cc: r-sig-finance at r-project.org
> >> Subject: Re: [R-SIG-Finance] troubles with apply.daily
> >>
> >> I'm not sure time() is very good for what you want to do. It's tied
> >> to R's builtin ts class, which, and it pains me to say this about R,
> >> really isn't very good (at least for finance-y things). I think all
> >> your problems come from that...
> >>
> >> Perhaps construct your new index sequence as:
> >>
> >> seq(start(alpha), end(alpha), by = "min")
> >>
> >> Since you didn't supply example data, let's try this (admittedly
> >> absurd) analysis to showcase how these techniques should work:
> >>
> >> library(quantmod)
> >> getSymbols("AAPL")
> >> AAPL <- Cl(AAPL)
> >>
> >> ## Force to have daily (including non-trading days) points
> >> AAPL <- na.approx(AAPL, xout = seq(start(AAPL), end(AAPL), by = "day"))
> >>
> >> # check that it worked
> >> head(AAPL, 20)
> >>
> >> # Now we aggregate to weekly
> >> AAPL.w <- to.weekly(AAPL)
> >>
> >> # and now we apply a function monthly
> >> apply.monthly(AAPL.w, max)
> >>
> >> So everything seems to be in order. Does this help?
> >>
> >> Michael
> >>
> >>
> >> On Sun, Jan 29, 2012 at 12:51 PM, Ted Byers <r.ted.byers at gmail.com> wrote:
> >> > I do not understand this well enough to figure out the cause, let
> >> > alone the fix.
> >> >
> >> >
> >> >
> >> > Here is what I tried:
> >> >
> >> >
> >> >
> >> > myfun <- function(d) {
> >> >
> >> >  str(d)
> >> >
> >> > }
> >> >
> >> > apply.daily(alpha,myfun)
> >> >
> >> >
> >> >
> >> > And here are the beginning and end of alpha (an xts object
> >> > created by to.minutes()):
> >> >
> >> >
> >> >
> >> >> head(alpha)
> >> >
> >> >                        Open     High      Low Close
> >> >
> >> > 2011-03-10 00:00:00 10350.00 10365.00 10350.00 10360
> >> >
> >> > 2011-03-10 00:00:01 10350.06 10364.97 10350.06 10360
> >> >
> >> > 2011-03-10 00:00:02 10350.11 10364.94 10350.11 10360
> >> >
> >> > 2011-03-10 00:00:03 10350.17 10364.92 10350.17 10360
> >> >
> >> > 2011-03-10 00:00:04 10350.22 10364.89 10350.22 10360
> >> >
> >> > 2011-03-10 00:00:05 10350.28 10364.86 10350.28 10360
> >> >
> >> >> tail(alpha)
> >> >
> >> >                    Open High  Low Close
> >> >
> >> > 2011-06-08 23:55:55 9430 9430 9430  9430
> >> >
> >> > 2011-06-08 23:55:56 9430 9430 9430  9430
> >> >
> >> > 2011-06-08 23:55:57 9430 9430 9430  9430
> >> >
> >> > 2011-06-08 23:55:58 9430 9430 9430  9430
> >> >
> >> > 2011-06-08 23:55:59 9430 9430 9430  9430
> >> >
> >> > 2011-06-08 23:56:00 9430 9430 9430  9430
> >> >
> >> >>
> >> >
> >> >
> >> >
> >> > There is almost three months of tick data here, converted to one
> >> > minute OHLC data.
> >> >
> >> >
> >> >
> >> > I had apparently successfully used the following to ensure I had an
> >> > even time series with values for every minute from start to end:
> >> >
> >> >
> >> >
> >> > tseq <- seq(start(alpha),end(alpha), by = time("00:01:00"))
> >> >
> >> > alpha <- na.approx(alpha, xout = tseq)
> >> >
> >> >
> >> >
> >> > But there is something weird here.  How is it that alpha appears to
> >> > have rows for every second from the start to the end, rather than
> >> > 'just' for every minute?
> >> >
> >> >
> >> >
> >> > Now here is what the output looks like:
> >> >
> >> >
> >> >
> >> > An 'xts' object from 2011-06-05 to 2011-06-05 23:59:59 containing:
> >> >
> >> >  Data: num [1:86400, 1:4] 9437 9437 9437 9437 9437 ...
> >> >
> >> > - attr(*, "dimnames")=List of 2
> >> >
> >> >  ..$ : NULL
> >> >
> >> >  ..$ : chr [1:4] "Open" "High" "Low" "Close"
> >> >
> >> >  Indexed by objects of class: [POSIXct,POSIXt] TZ:
> >> >
> >> >  xts Attributes:
> >> >
> >> >  NULL
> >> >
> >> > An 'xts' object from 2011-06-06 to 2011-06-06 23:59:59 containing:
> >> >
> >> >  Data: num [1:86400, 1:4] 9420 9420 9420 9420 9420 ...
> >> >
> >> > - attr(*, "dimnames")=List of 2
> >> >
> >> >  ..$ : NULL
> >> >
> >> >  ..$ : chr [1:4] "Open" "High" "Low" "Close"
> >> >
> >> >  Indexed by objects of class: [POSIXct,POSIXt] TZ:
> >> >
> >> >  xts Attributes:
> >> >
> >> >  NULL
> >> >
> >> > An 'xts' object from 2011-06-07 to 2011-06-07 23:59:59 containing:
> >> >
> >> >  Data: num [1:86400, 1:4] 9428 9428 9428 9428 9428 ...
> >> >
> >> > - attr(*, "dimnames")=List of 2
> >> >
> >> >  ..$ : NULL
> >> >
> >> >  ..$ : chr [1:4] "Open" "High" "Low" "Close"
> >> >
> >> >  Indexed by objects of class: [POSIXct,POSIXt] TZ:
> >> >
> >> >  xts Attributes:
> >> >
> >> >  NULL
> >> >
> >> > An 'xts' object from 2011-06-08 to 2011-06-08 23:56:00 containing:
> >> >
> >> >  Data: num [1:86161, 1:4] 9435 9435 9435 9435 9435 ...
> >> >
> >> > - attr(*, "dimnames")=List of 2
> >> >
> >> >  ..$ : NULL
> >> >
> >> >  ..$ : chr [1:4] "Open" "High" "Low" "Close"
> >> >
> >> >  Indexed by objects of class: [POSIXct,POSIXt] TZ:
> >> >
> >> >  xts Attributes:
> >> >
> >> >  NULL
> >> >
> >> > Error in coredata.xts(x) : currently unsupported data type
> >> >
> >> >
> >> >
> >> > Now, I do not understand what is happening here.  The data seem
> >> > consistent throughout, so why would it crash and burn on the very
> >> > last day, and only on that day, of the three months of data?
> >> >
> >> >
> >> >
> >> > Any insight would be greatly appreciated.
> >> >
> >> >
> >> >
> >> > Thanks
> >> >
> >> >
> >> >
> >> > Ted
> >> >
> >> >
> >> >        [[alternative HTML version deleted]]
> >> >
> >> > _______________________________________________
> >> > R-SIG-Finance at r-project.org mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> >> > -- Subscriber-posting only. If you want to post, subscribe first.
> >> > -- Also note that this is not the r-help list where general R
> >> > questions should go.
> >


