# [R-SIG-Finance] high frequency data analysis in R

Dale W.R. Rosenthal daler at uic.edu
Fri May 22 15:59:40 CEST 2009

Having looked a lot at high-frequency data, let me make a few
corrections here.

1) True, you have no transaction prices outside of when trades happen.
You could impute prices; but, that would also need to account for the
downward bias when an instrument has not traded for a while.  Also, you
cannot trust trade timestamps 100%; there is some publishing delay.
(The usual reasoning being that market makers need time to hedge after
getting hit/lifted.)

2) You do always have the NBBO and maybe even the book.  Better still,
you can probably trust quote timestamps.

3) Using these two data streams together requires caution.  Trades are
published with delay; and, while the delay might be small, it is not
small relative to the number of quote changes.  Therefore, you have
NBBOs and you have trade prices; but, matching them up is not
straightforward.  Worse:  There is endogeneity that can creep in.  A
trade may induce a change in quotes.  Comparing the trade price to
quotes after the trade occurred will bias any comparison.
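
To make the look-ahead concern concrete, here is a minimal sketch of an "as-of" match: each trade is compared only to the last quote at or before an assumed true trade time, never to a later quote. The function name `prevailing_quote`, the fixed `delay` parameter, and the sample numbers are hypothetical and for illustration only; a fixed delay is a simplifying assumption, not something the data guarantees.

```python
from bisect import bisect_right

def prevailing_quote(quote_times, quotes, trade_time, delay=0.0):
    """Return the last quote at or before (trade_time - delay), or None.

    quote_times must be sorted ascending. `delay` is a hypothetical fixed
    reporting lag, in the same time units as the timestamps.
    """
    # bisect_right finds the first quote strictly after the target time,
    # so the element just before it is the quote prevailing at that time.
    idx = bisect_right(quote_times, trade_time - delay) - 1
    return quotes[idx] if idx >= 0 else None

# Illustrative data: quote midpoints and their timestamps (seconds).
quote_times = [0.0, 1.0, 2.5, 3.0]
mids = [10.00, 10.01, 10.03, 10.02]

# Trade reported at t = 2.7: the quote from t = 3.0 is never used.
no_lag = prevailing_quote(quote_times, mids, 2.7)
with_lag = prevailing_quote(quote_times, mids, 2.7, delay=0.5)
print(no_lag, with_lag)
```

Allowing for even a short reporting lag can change which quote a trade is matched to, which is why the choice of delay matters.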

If you want to read up on a way to handle that delay, look at sections 3
and 4 of my trade direction paper at
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1032701.  The basic
idea: you might want to average quotes before the trade using something
like a gamma distribution for the delay -- so you have \hat{q}_t =
\int_0^T GammaCDF(s) q_{t-s} ds.

You can also find code to do this at http://tigger.uic.edu/~daler/code.html
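
For readers who want a feel for the averaging idea before reading the paper, here is a rough sketch. It is not the code from the page linked above. It assumes the quote path is piecewise constant, uses the gamma density of the delay as the weight (renormalized over [0, T] so the weights sum to one), and approximates the integral with a midpoint rule. The formula above is written with GammaCDF, so the density weighting is an assumption of this sketch. The shape and scale defaults and all function names are hypothetical.

```python
import math
from bisect import bisect_right

def gamma_pdf(s, shape, scale):
    """Gamma density at s; zero for s <= 0."""
    if s <= 0:
        return 0.0
    return (s ** (shape - 1) * math.exp(-s / scale)
            / (math.gamma(shape) * scale ** shape))

def quote_at(quote_times, quotes, t):
    """Piecewise-constant quote path: last quote at or before t.

    Assumption: before the first quote, the first quote is used.
    """
    idx = bisect_right(quote_times, t) - 1
    return quotes[max(idx, 0)]

def delay_weighted_quote(quote_times, quotes, t, horizon,
                         shape=2.0, scale=0.5, steps=1000):
    """Weighted average of quotes over [t - horizon, t].

    Weights are the gamma density of the delay s, renormalized over
    [0, horizon]; the integral is approximated by a midpoint rule.
    """
    ds = horizon / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds           # midpoint of the i-th delay bin
        w = gamma_pdf(s, shape, scale)
        num += w * quote_at(quote_times, quotes, t - s)
        den += w
    return num / den

# Illustrative data: quote midpoints and their timestamps (seconds).
quote_times = [0.0, 1.0, 2.5, 3.0]
mids = [10.00, 10.01, 10.03, 10.02]
estimate = delay_weighted_quote(quote_times, mids, t=2.7, horizon=2.0)
print(estimate)
```

Because the weights are renormalized, the estimate always lies between the lowest and highest quote in the window; the shape and scale would have to be estimated from data rather than taken from these placeholder values.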

Just a few thoughts, since we seem to be drifting toward mixing quotes
and trades.

Dale

> Message: 14
> Date: Thu, 21 May 2009 13:48:45 -0400
> From: Eugene Tyurin <etyurin at skipstonellc.com>
> Subject: Re: [R-SIG-Finance] high frequency data analysis in R
> To: Michael <comtech.usa at gmail.com>, r-sig-finance at stat.math.ethz.ch
>
> High-frequency is not my specialty either, but a quote caught my
> attention:
> On Thu, May 21, 2009 at 11:38 AM, Michael <comtech.usa at gmail.com> wrote:
>> > My data are price change arrivals, irregularly spaced. But when there
>> > is no price change, the price stays constant. Therefore, in fact, at
>> > any time instant, you give me a time, I can give you the price at that
>> > very instant of time. So irregularly spaced data can be easily sampled
>> > to be regularly spaced data.
>>
> From a trader's perspective, you do not have "the price" at any time
> outside of the instant a trade took place - you have NBBO (and market
> depth). Last trade's price may or may not be transactable again on
> either long or short side. You can alternatively say that you have an