[R-SIG-Finance] Aggregating tick data

Roupell, Darko Darko.Roupell at cba.com.au
Mon Nov 14 04:11:31 CET 2011


This should help you deal with the timestamp issue and let you aggregate tick data starting from whatever point you are interested in. It is a modified version of the source code in RTAQ, so I strongly suggest you get familiar with that package before you go ahead with the implementation.

aggregatets = function (agdata, FUN = NULL, on = "minutes", k = 1, weights = NULL,dropna=FALSE) 
{
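    # FUN may be a function or a function name; the default NULL is not usable, so pass FUN explicitly.
    # identical() avoids the error that FUN != "previoustick" raises when FUN is NULL or a function object.
    makethispartbetter = (!is.null(weights)) || on == "days" || on == "weeks" || !identical(FUN, "previoustick") || dropna;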
    if(makethispartbetter)  {
      
      FUN = match.fun(FUN)
      
      if (is.null(weights)) {
          ep = endpoints(agdata, on, k)
          # period.apply2 is not defined in this post; it is assumed to be the helper from the RTAQ source.
          if(dim(agdata)[2]==1){ ts2 = period.apply(agdata, ep, FUN) }
          if(dim(agdata)[2]>1){  ts2 = xts(apply(agdata,2,FUN=period.apply2,FUN2=FUN,INDEX=ep),order.by=index(agdata)[ep])}
      }
      if (!is.null(weights)) {
          tsb = cbind(agdata, weights)
          ep = endpoints(tsb, on, k)
          ts2 = period.apply(tsb, ep, FUN = match.fun(weightedaverage) )
      }
      if (on == "minutes" | on == "mins" | on == "secs" | on == "seconds") {
          if (on == "minutes" | on == "mins") {
              secs = k * 60
          }
          if (on == "secs" | on == "seconds") {
              secs = k
          }
          a = .index(ts2) + (secs - .index(ts2)%%secs)
          ts3 = .xts(ts2, a,tz="GMT")
      }
      if (on == "hours") {
          secs = 3600 * k  # scale by k so k-hour bins are stamped on k-hour boundaries, as in the previous-tick branch
          a = .index(ts2) + (secs - .index(ts2)%%secs)
          ts3 = .xts(ts2, a,tz="GMT")
      }
      if (on == "days") {
          secs = 24 * 3600
          a = .index(ts2) + (secs - .index(ts2)%%secs) - (24 * 
              3600)
          ts3 = .xts(ts2, a,tz="GMT")
      }
      if (on == "weeks") {
          secs = 24 * 3600 * 7
          a = (.index(ts2) + (secs - (.index(ts2) + (3L * 86400L))%%secs)) - 
              (24 * 3600)
          ts3 = .xts(ts2, a,tz="GMT")
      }
      #if (FUN="sumN") {
      #  ep = endpoints(agdata, on, k)
      #  ts2 = period.apply(agdata, ep, FUN = match.fun("sumN") )
      #}
      
      #if (FUN!="previoustick") {
      #  ep = endpoints(agdata, on, k)
      #  ts2 = period.apply(agdata, ep, FUN = match.fun("sumN") )
      #}
      
      if (!dropna) {
          if (on != "weeks" && on != "days") {  # regular grid fill only applies to intraday bins; tby is undefined for days/weeks
              if (on == "secs" | on == "seconds") {
                  tby = "s"
              }
              if (on == "mins" | on == "minutes") {
                  tby = "min"
              }
              if (on == "hours") {
                  tby = "h"
              }
              by = paste(k, tby, sep = " ")
              allindex = as.timeDate(seq(start(ts3), end(ts3), 
                  by = by))
              xx = xts(rep("1", length(allindex)), order.by = allindex)
              ts3 = merge(ts3, xx)[, (1:dim(agdata)[2])]
          }
      }
      
      index(ts3) = as.timeDate(index(ts3));
      return(ts3);
    }
    
    if(!makethispartbetter){
     if (on == "secs" | on == "seconds") { secs = k; tby = paste(k,"sec",sep=" ")}
     if (on == "mins" | on == "minutes") { secs = 60*k; tby = paste(60*k,"sec",sep=" ")}
     if (on == "hours") {secs = 3600*k; tby = paste(3600*k,"sec",sep=" ")}
    
      FUN = match.fun(FUN);
      
      g = seq(start(agdata), end(agdata), by = tby);
      rawg = as.numeric(as.POSIXct(g,tz="GMT"));
      newg = rawg + (secs - rawg%%secs);
      g = as.timeDate(as.POSIXct(newg,origin="1970-01-01",tz="GMT"));
      ts3 = na.locf(merge(agdata, zoo(, g)))[as.POSIXct(g,tz="GMT")]; 
      return(ts3) 
    }
}
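
For what it is worth, the timestamp arithmetic used in aggregatets can be tried in isolation. The following is a minimal base-R sketch, not part of RTAQ; the helper name round_up_to_bin and the sample timestamps are made up for illustration. Note that a stamp sitting exactly on a bin boundary is pushed to the next boundary.

```r
# Round epoch seconds up to the end of the enclosing k-minute bin,
# using the same expression as aggregatets: idx + (secs - idx %% secs).
# round_up_to_bin is a hypothetical helper name, not an RTAQ function.
round_up_to_bin <- function(epoch_secs, k = 5) {
  secs <- k * 60
  epoch_secs + (secs - epoch_secs %% secs)
}

# Made-up sample timestamps; the first one lies exactly on a 5-minute boundary.
stamps <- as.POSIXct(c("2011-11-14 09:30:00",
                       "2011-11-14 09:31:12",
                       "2011-11-14 09:34:59"), tz = "GMT")

rounded <- as.POSIXct(round_up_to_bin(as.numeric(stamps), k = 5),
                      origin = "1970-01-01", tz = "GMT")

# All three stamps, including 09:30:00 itself, end up at 09:35:00.
print(format(rounded, "%H:%M:%S"))
```

If you would rather have a stamp of exactly 09:30:00 stay in the bin that ends at 09:30:00, you would need to handle the case where the remainder is zero separately.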

#PRICE (specificity: opening price and previoustick)

aggregatePrice = function (pricedata, FUN = "previoustick", on = "minutes", k = 1,marketopen="10:00:00",marketclose = "16:00:00") 
{

    pricedata = dataformatc(pricedata)
    aggpdata = aggregatets(pricedata, FUN = FUN, on, k)
    lastdaydate = strsplit(as.character(index(pricedata)), " ")[[1]][1]

	#open
    openingtime= as.timeDate(paste(lastdaydate, marketopen))
    getopenprice = as.xts(matrix(as.numeric( pricedata[1]),nrow=1), openingtime)
    aggpwithopen = c(getopenprice, aggpdata)

	#close
    closetime = as.timeDate(paste( lastdaydate, marketclose))
    condition = index(aggpwithopen) < closetime
    aggpwithopen = aggpwithopen[condition]
    getcloseprice = as.xts(matrix(as.numeric(last(pricedata)),nrow=1), closetime)
    aggpwithclose = c(aggpwithopen, getcloseprice)

    return(aggpwithclose)
}
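
For the original question (5-minute means on the same clock grid every day), a shorter route may be enough. This is only a sketch: it assumes the xts package is installed, and the Data object below is made-up sample data standing in for the real one. As far as I can tell, endpoints() already cuts on clock boundaries and it is period.apply() that labels each bin with the time of its last observation, so align.time() is used to move each label up to the end of its 5-minute bin.

```r
library(xts)

# Made-up tick data over two days; "Duration" mimics the column in the question.
times <- as.POSIXct(c("2011-11-01 09:30:07", "2011-11-01 09:33:41",
                      "2011-11-01 09:36:15", "2011-11-02 09:31:52",
                      "2011-11-02 09:34:03", "2011-11-02 09:38:20"), tz = "GMT")
Data <- xts(c(2, 4, 6, 1, 3, 5), order.by = times)
colnames(Data) <- "Duration"

# Mean duration per 5-minute bin; each bin is stamped with its last tick time.
ep <- endpoints(Data[, "Duration"], on = "minutes", k = 5)
binned <- period.apply(Data[, "Duration"], ep, FUN = mean)

# Move every stamp up to the next 5-minute boundary (n is in seconds),
# so the same bins carry the same clock time on every day.
aligned <- align.time(binned, n = 5 * 60)
print(aligned)
```

Bins without any trades simply do not appear in the result; the merge against a regular grid in aggregatets is what fills those in.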

__________________________________________________
Commonwealth Bank 
Darko Roupell 
Associate Quantitative Analyst 
Institutional Banking & Markets
Equities Research


-----Original Message-----
From: r-sig-finance-bounces at r-project.org [mailto:r-sig-finance-bounces at r-project.org] On Behalf Of Matthew Gilbert
Sent: Monday, 14 November 2011 2:02 PM
To: r-sig-finance at r-project.org
Subject: [R-SIG-Finance] Aggregating tick data

Hi,

I have tick data for trade durations between 9:30 and 4:00 over the 
course of a month which I want to aggregate into 5 minute bins, starting 
from 9:30. What I currently have tried is

Durations.5min = 
period.apply(Data[,'Duration'],endpoints(Data[,'Duration'],on='minutes',k=5),FUN=mean)

however this returns bins starting from the first trade occurring on 
that day, and therefore my bins are not all the same over the course of 
the month.

Any suggestions are much appreciated.

Thanks,
Matt

-- 
Matthew Gilbert
Master of Quantitative Finance Candidate
University of Waterloo

_______________________________________________
R-SIG-Finance at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
