[R-SIG-Finance] Estimating volume at price for backtest data bars?

Mark Knecht markknecht at gmail.com
Wed Apr 7 16:25:12 CEST 2010

On Wed, Apr 7, 2010 at 5:13 AM,  <markleeds at verizon.net> wrote:
> hi mark: if i understand your question (not sure that I do), google for
> the Lee-Ready algorithm.
> On Apr 7, 2010, Brian G. Peterson <brian at braverock.com> wrote:
> Mark Knecht wrote:
>> Hi,
>> I wonder if anyone has seen anything written publicly about
>> estimating volume at price within backtest bars where you don't have
>> any data other than up/down volume for the bar? For instance, when
>> looking at 1 minute ES bars they are, on average maybe 6-8 price ticks
>> tall. How might one estimate where the volume was in the bar for
>> upticks and downticks? Clearly there's no way to be completely
>> accurate, but maybe it's better than doing nothing.
>> I'm playing with using sort of a skewed normal distribution which is
>> interesting, but it doesn't take open and close into account. When I
>> look at real data there's often a bit more up volume near the top when
>> the price is trending strongly up, and more to the downside when the
>> price is trending down. I could use other indicators for simple
>> estimates of trend, and then do something based on that.
>> Anyway, I'm just interested in reading something on the subject, if
>> there's anything out there, before messing with R code.
> If I get some time, I'll dig through my collection of papers, because there
> is literature on matching up Buy-driven versus sell-driven volume.
> However, all this literature is on tick data. You will likely need to
> acquire a source for tick data before you can have any accuracy at all in
> any resulting estimates.
> - Brian

Brian & Mark,
   Thanks for your responses. Hopefully folks don't consider this
off-topic as I haven't posted any R code. If so we can cease the

   On TradeStation I have tick data down to 1 tick. The problem is
that it only goes back 6 months and I need to look at system
performance much further back - say 10-20 years - where I only have
minute or larger data and want to estimate what the tick data might
have told me if I had it. The key word here is 'estimate'.

   Here's a bit more explanation about what I'm thinking about:

1) Assume I'm looking at bars for some instrument and that a certain
bar is maybe 9 ticks tall. (When talking about price I use the word
'tick' to mean each minimum price move within that bar. For the ES the
minimum move is 0.25, so each 0.25 is one 'tick' vertically.) The 9-tick
bar goes from 998 to 1000, i.e. 9 price levels.

2) Assume for discussion that within this bar there was up volume of
around 15000 contracts and down volume of around 7000 contracts, and
that this bar happened a long time ago so I don't have tick
data. How might I __estimate__ how much volume occurred at each
minimum-move tick within the bar? In the diagram below (which
likely won't survive email very well) I use a symmetrical, sort-of
Fibonacci-based distribution of the volume:

Dn = Down volume
Up = Up volume

  Dn      Price      Up
   0    1000.00     800
 500     999.75    1300
 800     999.50    2100
1300     999.25    3400
2100     999.00    3400
1300     998.75    2100
 800     998.50    1300
 500     998.25     800
 300     998.00       0

   In general there won't be any down volume at the highest price in
the bar (upper left) and there won't be any up volume at the lowest
price in the bar (lower right).
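
   As a rough illustration of the allocation described above, here is a
minimal sketch in Python (chosen only for illustration; nothing in it is
language-specific). The function names `hump_weights`, `split_exact` and
`estimate_bar` are hypothetical, and the weighting is an assumption: a
symmetric hump that grows Fibonacci-style (8, 13, 21, 34, ...) toward the
middle of the bar, applied to both sides. Up volume is spread over every
level except the low, and down volume over every level except the high.

```python
def hump_weights(n, first=8, second=13):
    """Symmetric weights over n levels, growing Fibonacci-style toward the middle."""
    seq = [first, second]
    while len(seq) < (n + 1) // 2:
        seq.append(seq[-1] + seq[-2])
    return [seq[min(i, n - 1 - i)] for i in range(n)]


def split_exact(total, weights):
    """Split an integer total in proportion to integer weights; the parts sum to total."""
    scale = sum(weights)
    shares = [total * w // scale for w in weights]
    remainders = [total * w % scale for w in weights]
    leftover = total - sum(shares)
    # Hand the leftover contracts to the levels with the largest remainders.
    order = sorted(range(len(weights)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def estimate_bar(low, high, up_vol, dn_vol, tick=0.25):
    """Return (price, down_volume, up_volume) rows, ordered from low to high."""
    n = round((high - low) / tick) + 1          # number of price levels in the bar
    prices = [low + i * tick for i in range(n)]
    if n == 1:                                  # flat bar: everything at one price
        return [(prices[0], dn_vol, up_vol)]
    w = hump_weights(n - 1)
    up = [0] + split_exact(up_vol, w)           # no up volume at the low
    dn = split_exact(dn_vol, w) + [0]           # no down volume at the high
    return list(zip(prices, dn, up))


if __name__ == "__main__":
    for price, dn, up in reversed(estimate_bar(998.0, 1000.0, 15200, 7600)):
        print(f"{dn:>5} {price:>9.2f} {up:>5}")
```

   The weights are only a starting shape; any other list of per-level
weights could be passed to `split_exact` in the same way.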

   What I'm interested in is a means of estimating what volume *might*
have traded at each price level based on volume of the bar, relative
volume of the bar to previous or subsequent bars, time of day, type of
bar (candlestick type, for instance). Basically, use any criteria
you want and come up with a way to assign volume to price in a
non-arbitrary and (hopefully) more or less realistic way.
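
   One way such a criterion could enter the estimate is sketched below,
purely as an assumption and not a published method: tilt a set of
per-level weights toward the top of the bar when the bar closes above
its open, and toward the bottom when it closes below. The function
`tilted_weights` and its `tilt` parameter are hypothetical names; `base`
is any list of weights ordered from the bar's low to its high.

```python
def tilted_weights(base, open_, high, low, close, tilt=0.5):
    """Skew per-level weights (ordered low to high) in the direction of the bar.

    tilt is an assumed tuning knob in [0, 1): 0 leaves the weights unchanged,
    values near 1 push almost all of the weight toward the close side.
    """
    n = len(base)
    if n < 2 or high == low:
        return list(base)
    trend = (close - open_) / (high - low)      # -1 (strong down) .. +1 (strong up)
    out = []
    for i, w in enumerate(base):
        pos = 2 * i / (n - 1) - 1               # -1 at the low, +1 at the high
        out.append(w * (1 + tilt * trend * pos))
    return out


if __name__ == "__main__":
    base = [1, 2, 3, 2, 1]
    # Bar opens at its low and closes at its high: weight shifts upward.
    print(tilted_weights(base, 998.0, 1000.0, 998.0, 1000.0))
```

   The output is still just a set of relative weights, so it would be
normalised when the bar's volume is actually split across the levels.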

   Hopefully this clarifies what I'm looking for. The R code
likely won't be hard at all, depending on how much math is required
and what sort of accuracy someone is looking for. In my case it really
only needs to be a rough estimate, as I don't care exactly what
happens in any single bar, only that over a lot (say, hundreds) of
bars I get a reasonably clear picture of volume at any given price.
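
   The accumulation over many bars could look something like the sketch
below. It is written in Python for illustration only, and the per-bar
rule is a deliberately crude placeholder (an even split across levels),
which is an assumption and would be swapped for whatever estimate one
prefers. The names `even_split`, `add_bar` and `build_profile` are
hypothetical, and each bar is assumed to be a (low, high, up volume,
down volume) tuple. Prices are keyed by integer tick count so that bars
line up exactly.

```python
from collections import defaultdict

TICK = 0.25  # ES minimum price move


def even_split(total, n):
    """Split an integer total into n near-equal integer parts."""
    base, extra = divmod(total, n)
    return [base + (1 if i < extra else 0) for i in range(n)]


def add_bar(profile, low, high, up_vol, dn_vol, tick=TICK):
    """Add one bar's estimated [down, up] volume per level to the running profile."""
    lo, hi = round(low / tick), round(high / tick)
    n = hi - lo                        # levels available to each side
    if n == 0:                         # flat bar: all volume at one price
        profile[lo][0] += dn_vol
        profile[lo][1] += up_vol
        return
    ups = even_split(up_vol, n)        # levels lo+1 .. hi (none at the low)
    dns = even_split(dn_vol, n)        # levels lo .. hi-1 (none at the high)
    for i in range(n):
        profile[lo + 1 + i][1] += ups[i]
        profile[lo + i][0] += dns[i]


def build_profile(bars, tick=TICK):
    """bars: iterable of (low, high, up_vol, dn_vol). Returns {price: (down, up)}."""
    profile = defaultdict(lambda: [0, 0])
    for low, high, up_vol, dn_vol in bars:
        add_bar(profile, low, high, up_vol, dn_vol, tick)
    return {k * tick: tuple(v) for k, v in sorted(profile.items())}


if __name__ == "__main__":
    bars = [(998.0, 1000.0, 15000, 7000), (999.0, 1000.5, 6000, 9000)]
    for price, (dn, up) in sorted(build_profile(bars).items(), reverse=True):
        print(f"{dn:>6} {price:>9.2f} {up:>6}")
```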

